[X] Cancel in this application original claims 1-41 of the parent application before calculating the 
filing fee. (At least one original independent claim must be retained for filing purposes.) 

[X] A Preliminary Amendment is enclosed. (Claims added by this Amendment have been 
properly numbered consecutively beginning with the number following the highest numbered 
original claim in the prior application.) 

The status of the parent application is as follows: 

[ ] A Petition For Extension of Time and a Fee therefor has been or is being filed in the parent 
application to extend the term for action in the parent application until . 

[ ] A copy of the Petition for Extension of Time in the co-pending parent application is attached. 

[X] No Petition For Extension of Time and Fee therefor are necessary in the co-pending parent 
application. 
[ ] Please abandon the parent application at a time while the parent application is pending or at a time 
when the petition for extension of time in that application is granted and while this application is 
pending has been granted a filing date, so as to make this application co-pending. 

[ ] Transfer the drawing(s) from the patent application to this application. 

[X] Amend the specification by inserting before the first line the sentence: 

This is a [ ] continuation [X] divisional [ ] continuation-in-part of co-pending application Serial No. 
09/014,416. 
for filing this application, or credit any overpayment to Deposit Account No. 13-4500, Order No. 
2026-4276US1. A DUPLICATE COPY OF THIS SHEET IS ATTACHED. 
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

Applicant(s) : Yanagi et al. Group Art Unit: To be assigned 

Serial No. Divisional of 09/014,416 Examiner: To be assigned 

Filed : September 14, 2000 

For : CLONED GENOMES OF INFECTIOUS HEPATITIS C 

VIRUSES AND USES THEREOF 

PRELIMINARY AMENDMENT 

COMMISSIONER OF PATENTS 
Washington, D.C. 20231 

Sir: 

Prior to examination on the merits, Applicants respectfully request entry of 
the following Preliminary Amendment. 
IN THE SPECIFICATION 

On page 1, line 4, after the recitation of "This application", insert -- is a 
divisional of U.S. Serial No. 09/014,416 filed January 27, 1998 which --. 

On page 9, line 8 after recitation of "sequence" and prior to the recitation 
of "of a H77C clone" insert -- (SEQ ID NO:2) --. 

On page 9, line 9 after recitation of "amino acid sequence" insert -- (SEQ 
ID NO:1) --. 

On page 9, line 29 after recitation of "Figure 7" insert -- A through 7D --. 

On page 10, line 20 after recitation of "HVR1" insert -- (SEQ ID NOS:28, 
30, 32, 34, 36-38, 41, 43 and 45) --; at line 21 after recitation of "HVR2" insert -- (SEQ ID 
NOS:29, 31, 33, 35, 39, 40, 42, 44 and 46) --. 
On page 10, line 32 after recitation of "1b (pCV-J4L6S)." insert -- 5' UTR 
for HC-J4 is SEQ ID NO:47, 5' UTR for pCV-J4L6S is SEQ ID NO:48, 5' UTR for pCV-H77C 
is SEQ ID NO:49, 3' UTR - 3' variable region for HC-J4 is SEQ ID NO:50 and 53, 
3' UTR - 3' variable region for pCV-J4L6S is SEQ ID NO:51 and 54, 3' UTR - 3' variable 
region for pCV-H77C is SEQ ID NO:52 and 54; 3' UTR - 3' conserved region for H77, 
pCV-J4L6S and pCV-H77C is SEQ ID NO:55 --. 

On page 12, last line after recitation of "Accession No." insert -- 209596 --. 

On page 11, line 20 after recitation of "strain HC-J4" insert -- (SEQ ID 
NO:4) --. 

On page 11, line 21 after recitation of "amino acid sequence" insert -- 
(SEQ ID NO:3) --. 

On page 11, line 29 after recitation of "clone pH77CV-J4" insert -- (SEQ 
ID NO:6) --. 

On page 11, line 31 after recitation of "chimeric 1a/1b clone" insert -- (SEQ 
ID NO:5) --. 

On page 11, line 34 after recitation of "1a infectious clone pCV-H77C" 
insert -- (pCV-H77C has SEQ ID NOS:56, 57 and 58; pCV-H77C (-98X) has SEQ ID 
NO:59; pCV-H77C (-42X) has SEQ ID NO:60; pCV-H77C (X-52) has SEQ ID NO:61; 
pCV-H77C (X) has SEQ ID NO:62; pCV-H77C (+49X) has SEQ ID NO:63; pCV-H77C 
(VR-24) has SEQ ID NO:64; and pCV-H77C (-U/UC) has SEQ ID NO:65) --. 

On page 29, Table 1, line 4, after recitation of "H9261F" insert -- SEQ ID 
NO:7 --; at line 5 after recitation of "H3' x 58R" insert -- SEQ ID NO:8 --; at line 6 after 
recitation of "H9282F" insert -- SEQ ID NO:9 --; at line 7 after recitation of "H3' X 45R" 
insert -- SEQ ID NO:10 --; at line 8 after recitation of "H9375F" insert -- SEQ ID NO:11 --; 
at line 9 after recitation of "H3' X -35R" insert -- SEQ ID NO:12 --; at line 10 after recitation 
of "H9386F" insert -- SEQ ID NO:13 --; at line 11 after recitation of "H3' X - 38R" insert -- 
SEQ ID NO:14 --; at line 12 after recitation of "H1" insert -- SEQ ID NO:15 --; at line 13 
after recitation of "H9417R" insert -- SEQ ID NO:17 --. 

On page 41, line 1 after recitation of "(5'-GCCTATTGGCCTGGAGTGGTT 
AGCTC - 3')" insert -- SEQ ID NO:18 --; at line 6 after recitation of 
"AGGATGGCCTTAAGG CCTGGAGTGGTTAGCTCGCCGTTCA - 3')" insert -- SEQ ID 
NO:19 --. 

On page 51, line 1, after recitation of "H2751S (Cla I/Nde I)" insert -- SEQ 
ID NO:20 --; at line 3 after recitation of "H2870R" insert -- SEQ ID NO:21 --; at line 5 
after recitation of "H7851S" insert -- SEQ ID NO:22 --; at line 7 after recitation of "H9173R 
(P-M)" insert -- SEQ ID NO:23 --; at line 9 after the recitation of "H9140S (P-M)" insert 
-- SEQ ID NO:24 --; at line 11 after the recitation of "H9417R" insert -- SEQ ID NO:25 --; 
at line 14 after recitation of "J4-2271S" insert -- SEQ ID NO:26 --; at line 16 after 
recitation of "J4-2776R (Nde I)" insert -- SEQ ID NO:27 --. 

After page 62 of the "Abstract of the Disclosure" insert -- Sequence Listing 
-- page number 1 through 61. 
IN THE CLAIMS 

Please cancel claims 1-41 without prejudice. 

Please amend the following claims: 

42. (Amended) A composition comprising a purified and isolated 
nucleic acid molecule [of claim 1] suspended in a suitable amount of a pharmaceutically 
acceptable diluent or excipient, said nucleic acid molecule encodes human hepatitis C 
virus, wherein expression of said molecule in transfected cells results in production of 
virus when transfected into cells. 

43. (Amended) A method for treating hepatitis C viral infection 
comprising the administration to [a] an animal in need thereof of a clinically effective 
amount of the composition of claim 42. 

Please add the following new claims: 

-- 44. The composition of claim 42, wherein the molecule encodes the 
amino acid sequence of SEQ ID NO:3 shown in Figures 14G-14H. 

45. The composition of claim 42, wherein the molecule comprises the 
nucleic acid sequence of SEQ ID NO:4 shown in Figures 14A-14F. 

46. The composition of claim 42, wherein the molecule encodes the 
amino acid sequence of SEQ ID NO:1 shown in Figures 4G-4H. 

47. The composition of claim 42, wherein the molecule comprises the 
nucleic acid sequence of SEQ ID NO:2 shown in Figures 4A-4F. 

48. A composition comprising a purified and isolated nucleic acid 
molecule suspended in a suitable amount of a pharmaceutically acceptable diluent or 
excipient, said nucleic acid molecule encodes human hepatitis C virus, wherein 
expression of said molecule in transfected cells results in production of virus, wherein a 
fragment of said molecule which encodes the structural region of hepatitis C virus has 
been replaced by the structural region from the genome of another hepatitis C virus 
strain. 

49. The composition according to claim 48, wherein the molecule 
encodes the amino acid sequence of SEQ ID NO:5 shown in Figures 16G-16H. 

50. The composition according to claim 48, wherein the molecule 
comprises the nucleic acid sequence of SEQ ID NO:6 shown in Figures 16A-16F. 

51. A composition comprising a purified and isolated nucleic acid 
molecule suspended in a suitable amount of a pharmaceutically acceptable diluent or 
excipient, said nucleic acid molecule encodes human hepatitis C virus, wherein 
expression of said molecule in transfected cells results in production of virus, wherein a 
fragment of the nucleic acid molecule which encodes at least one HCV protein has been 
replaced by a fragment of the genome of another hepatitis C virus strain which encodes 
the corresponding protein. 

52. The composition of claim 51, wherein the protein is selected from 
the group consisting of NS3 protease, E1 protein, E2 protein and NS4 protein. 

53. A composition comprising a purified and isolated nucleic acid 
molecule suspended in a suitable amount of a pharmaceutically acceptable diluent or 
excipient, said nucleic acid molecule encodes human hepatitis C virus, wherein 
expression of said molecule in transfected cells results in production of virus, wherein a 
fragment of the molecule encoding all or part of an HCV protein has been deleted and, 
wherein the HCV protein is selected from the group consisting of P7, NS4B and NS5A 
proteins. 

54. The composition according to claims 40 or 48, wherein the nucleic 
acid molecule encodes an HCV protease selected from the group consisting of NS3 
domain protease, NS3-NS4 fusion polypeptide and NS2-NS3 protease. 

55. A method of immunizing an animal against hepatitis C virus 
comprising administration of a composition of claim 42, 48, 51 or 53 in an amount 
effective to induce immunity against hepatitis C virus. 

56. The method according to claim 55, wherein the composition is 
provided prophylactically. 

57. The method according to claim 55, wherein the composition is 
provided to an animal infected with a hepatitis C virus. -- 

REMARKS 

A restriction requirement was placed on the claims in the parent 
application Serial No. 09/014,416. Applicants are pursuing herein the claims of Group 
VII, claims 42 and 43, in the present divisional application. 

New claims 44-57 have been added, which find support from the 
specification and original claims. Claims 44-50 are supported by claims 2-8, 
respectively. Claims 51-52 are supported by claims 9 and 10. Claims 53-54 are 
supported by claims 11, 12 and 28. Claims 55-57 are supported by claim 43 and at 
page 6, lines 16-30 and page 7, lines 4-5. 

No new matter has been added by the Preliminary Amendment. Entry 
thereof is respectfully requested. 

Applicants have also filed herein a sequence listing in compliance with the 
sequence rules under 37 C.F.R. §1.821-§1.825 (Exhibit A), a computer readable 
sequence listing (Exhibit B) and a statement under 37 C.F.R. §1.821(f) and §1.821(g) 
which states that the content of the paper sequence and the computer readable 
sequence listings are identical and that no new matter has been added (Exhibit C). 
Entry and favorable action by the Examiner is respectfully requested. 



MORGAN & FINNEGAN, L.L.P. 
345 Park Avenue 
New York, New York 10154 
(212) 758-4800 Telephone 
(212) 751-6849 Facsimile 



Respectfully submitted, 



Dated: September 14, 2000 




Kathryn M. Brown 
Reg. No. 34,556 
Title Of Invention 



Cloned Genomes Of Infectious 
Hepatitis C Viruses And Uses Thereof 

This application claims the benefit of U.S. 
Provisional Application No. 60/053,062 filed July 18, 
1997. 

Field Of Invention 
The present invention relates to molecular 
approaches to the production of nucleic acid sequences 
which comprise the genome of infectious hepatitis C 
viruses. In particular, the invention provides nucleic 
acid sequences which comprise the genomes of infectious 
hepatitis C viruses of genotype 1a and 1b strains. The 
invention therefore relates to the use of these sequences, 
and polypeptides encoded by all or part of these 
sequences, in the development of vaccines and diagnostic 
assays for HCV and in the development of screening assays 
for the identification of antiviral agents for HCV. 

Background Of Invention 
Hepatitis C virus (HCV) has a positive-sense 
single-strand RNA genome and is a member of the virus 
family Flaviviridae (Choo et al., 1991; Rice, 1996). As 
for all positive-stranded RNA viruses, the genome of HCV 
functions as mRNA from which all viral proteins necessary 
for propagation are translated. 

The viral genome of HCV is approximately 9600 
nucleotides (nts) and consists of a highly conserved 5' 
untranslated region (UTR), a single long open reading 
frame (ORF) of approximately 9,000 nts and a complex 3' 
UTR. The 5' UTR contains an internal ribosomal entry site 
(Tsukiyama-Kohara et al., 1992; Honda et al., 1996). The 
3' UTR consists of a short variable region, a 
polypyrimidine tract of variable length and, at the 3' 
end, a highly conserved region of approximately 100 nts 
(Kolykhalov et al., 1996; Tanaka et al., 1995; Tanaka et 
al., 1996; Yamada et al., 1996). The last 46 nucleotides 
of this conserved region were predicted to form a stable 
stem-loop structure thought to be critical for viral 
replication (Blight and Rice, 1997; Ito and Lai, 1997; 
Tsuchihara et al., 1997). The ORF encodes a large 
polypeptide precursor that is cleaved into at least 10 
proteins by host and viral proteinases (Rice, 1996). The 
predicted envelope proteins contain several conserved 
N-linked glycosylation sites and cysteine residues (Okamoto 
et al., 1992a). The NS3 gene encodes a serine protease 
and an RNA helicase and the NS5B gene encodes an 
RNA-dependent RNA polymerase. 

Globally, six major HCV genotypes (genotypes 1-6) 
and multiple subtypes (a, b, c, etc.) have been 
identified (Bukh et al., 1993; Simmonds et al., 1993). 
The most divergent HCV isolates differ from each other by 
more than 30% over the entire genome (Okamoto et al., 
1992a) and HCV circulates in an infected individual as a 
quasispecies of closely related genomes (Bukh et al., 
1995; Farci et al., 1997). 

At present, more than 80% of individuals 
infected with HCV become chronically infected and these 
chronically infected individuals have a relatively high 
risk of developing chronic hepatitis, liver cirrhosis and 
hepatocellular carcinoma (Hoofnagle, 1997). In the U.S., 
HCV genotypes 1a and 1b constitute the majority of 
infections while in many other areas, especially in Europe 
and Japan, genotype 1b predominates. 

The only effective therapy for chronic hepatitis 
C, interferon (IFN), induces a sustained response in less 
than 25% of treated patients (Fried and Hoofnagle, 1995). 
Consequently, HCV is currently the most common cause of 
end stage liver failure and the reason for about 30% of 
liver transplants performed in the U.S. (Hoofnagle, 1997). 
In addition, a number of recent studies suggested that the 
severity of liver disease and the outcome of therapy may 
be genotype-dependent (reviewed in Bukh et al., 1997). In 
particular, these studies suggested that infection with 
HCV genotype 1b was associated with more severe liver 
disease (Brechot, 1997) and a poorer response to IFN 
therapy (Fried and Hoofnagle, 1995). As a result of the 
inability to develop a universally effective therapy 
against HCV infection, it is estimated that there are 
still more than 25,000 new infections yearly in the U.S. 
(Alter, 1997). Moreover, since there is no vaccine for HCV, 
HCV remains a serious public health problem. 

However, despite the intense interest in the 
development of vaccines and therapies for HCV, progress 
has been hindered by the absence of a useful cell culture 
system and the lack of any small animal model for 
laboratory study. For example, while replication of HCV 
in several cell lines has been reported, such observations 
have turned out not to be highly reproducible. In 
addition, the chimpanzee is the only animal model, other 
than man, for this disease. Consequently, HCV has been 
able to be studied only by using clinical materials 
obtained from patients or experimentally infected 
chimpanzees (an animal model whose availability is very 
limited). 

However, several researchers have recently 
reported the construction of infectious cDNA clones of 
HCV, the identification of which would permit a more 
effective search for susceptible cell lines and facilitate 
molecular analysis of the viral genes and their function. 
For example, Dash et al. (1997) and Yoo et al. (1995) 
reported that RNA transcripts from cDNA clones of HCV-1 
(genotype 1a) and HCV-N (genotype 1b), respectively, 
resulted in viral replication after transfection into 
human hepatoma cell lines. Unfortunately, the viability 
of these clones was not tested in vivo and concerns were 
raised about the infectivity of these cDNA clones in vitro 
(Fausto, 1997). In addition, both clones did not contain 
the terminal 98 conserved nucleotides at the very 3' end 
of the UTR. 

Kolykhalov et al. (1997) and Yanagi et al. 
(1997) reported the derivation from HCV strain H77 (which 
is genotype 1a) of cDNA clones of HCV that are infectious 
for chimpanzees. However, while these infectious clones 
will aid in studying HCV replication and pathogenesis and 
will provide an important tool for development of in vitro 
replication and propagation systems, it is important to 
have infectious clones of more than one genotype given the 
extensive genetic heterogeneity of HCV and the potential 
impact of such heterogeneity on the development of 
effective therapies and vaccines for HCV. 

Summary Of The Invention 

The present invention relates to nucleic acid 
sequences which comprise the genome of infectious 
hepatitis C viruses and in particular, nucleic acid 
sequences which comprise the genome of infectious 
hepatitis C viruses of genotype 1a and 1b strains. It is 
therefore an object of the invention to provide nucleic 
acid sequences which encode infectious hepatitis C 
viruses. Such nucleic acid sequences are referred to 
throughout the application as "infectious nucleic acid 
sequences". 

For the purposes of this application, nucleic 
acid sequence refers to RNA, DNA, cDNA or any variant 
thereof capable of directing host organism synthesis of a 
hepatitis C virus polypeptide. It is understood that 
nucleic acid sequence encompasses nucleic acid sequences, 
which due to degeneracy, encode the same polypeptide 
sequence as the nucleic acid sequences described herein. 
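
The degeneracy point above can be illustrated with a minimal Python sketch. The two DNA strings below are short illustrative sequences chosen for this example and are not taken from the sequence listing; the codon table is a partial subset of the standard genetic code covering only the codons used here, and the function name `translate` is an assumption of this sketch.

```python
# Partial standard genetic code: only the codons needed for this illustration.
CODON_TABLE = {
    "ATG": "M",
    "AGC": "S", "TCA": "S",
    "ACG": "T", "ACC": "T",
    "AAT": "N", "AAC": "N",
    "CCT": "P", "CCA": "P",
    "AAA": "K", "AAG": "K",
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, codon by codon, into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical sequences that differ at the nucleotide level
# but use synonymous codons at every position where they differ.
seq_a = "ATGAGCACGAATCCTAAA"
seq_b = "ATGTCAACCAACCCAAAG"

print(translate(seq_a))  # MSTNPK
print(translate(seq_b))  # MSTNPK
```

Both strings translate to the same six-residue peptide even though they differ at five of six codons, which is the sense in which degenerate sequences "encode the same polypeptide sequence".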

The invention also relates to the use of the 
infectious nucleic acid sequences to produce chimeric 
genomes consisting of portions of the open reading frames 
of infectious nucleic acid sequences of other genotypes 
(including, but not limited to, genotypes 1, 2, 3, 4, 5 
and 6) and subtypes (including, but not limited to, 
subtypes 1a, 1b, 2a, 2b, 2c, 3a, 4a-4f, 5a and 6a) of HCV. 
For example, infectious nucleic acid sequence of the 1a and 
1b strains H77 and HC-J4, respectively, described herein 
can be used to produce chimeras with sequences from the 
genomes of other strains of HCV from different genotypes 
or subtypes. Nucleic acid sequences which comprise 
sequence from the open-reading frames of 2 or more HCV 
genotypes or subtypes are designated "chimeric nucleic 
acid sequences". 

The invention further relates to mutations of 
the infectious nucleic acid sequences of the invention 
where mutation includes, but is not limited to, point 
mutations, deletions and insertions. In one embodiment, a 
gene or fragment thereof can be deleted to determine the 
effect of the deleted gene or genes on the properties of 
the encoded virus such as its virulence and its ability to 
replicate. In an alternative embodiment, a mutation may 
be introduced into the infectious nucleic acid sequences 
to examine the effect of the mutation on the properties of 
the virus in the host cell. 

The invention also relates to the introduction 
of mutations or deletions into the infectious nucleic acid 
sequences in order to produce an attenuated hepatitis C 
virus suitable for vaccine development. 

The invention further relates to the use of the 
infectious nucleic acid sequences to produce attenuated 
viruses via passage in vitro or in vivo of the viruses 
produced by transfection of a host cell with the 
infectious nucleic acid sequence. 

The present invention also relates to the use of 
the nucleic acid sequences of the invention or fragments 
thereof in the production of polypeptides where "nucleic 
acid sequences of the invention" refers to infectious 
nucleic acid sequences, mutations of infectious nucleic 
acid sequences, chimeric nucleic acid sequences and 
sequences which comprise the genome of attenuated viruses 
produced from the infectious nucleic acid sequences of the 
invention. The polypeptides of the invention, especially 
structural polypeptides, can serve as immunogens in the 
development of vaccines or as antigens in the development 
of diagnostic assays for detecting the presence of HCV in 
biological samples. 

The invention therefore also relates to vaccines 
for use in immunizing mammals, especially humans, against 
hepatitis C. In one embodiment, the vaccine comprises one 
or more polypeptides made from a nucleic acid sequence of 
the invention or fragment thereof. In a second 
embodiment, the vaccine comprises a hepatitis C virus 
produced by transfection of host cells with the nucleic 
acid sequences of the invention. 

The present invention therefore relates to 
methods for preventing hepatitis C in a mammal. In one 
embodiment the method comprises administering to a mammal 
a polypeptide or polypeptides encoded by a nucleic acid 
sequence of the invention in an amount effective to induce 
protective immunity to hepatitis C. In another 
embodiment, the method of prevention comprises 
administering to a mammal a hepatitis C virus of the 
invention in an amount effective to induce protective 
immunity against hepatitis C. 

In yet another embodiment, the method of 
protection comprises administering to a mammal a nucleic 
acid sequence of the invention or a fragment thereof in an 
amount effective to induce protective immunity against 
hepatitis C. 

The invention also relates to hepatitis C 
viruses produced by host cells transfected with the 
nucleic acid sequences of the present invention. 

The invention therefore also provides 
pharmaceutical compositions comprising the nucleic acid 
sequences of the invention and/or their encoded hepatitis 
C viruses. The invention further provides pharmaceutical 
compositions comprising polypeptides encoded by the 
nucleic acid sequences of the invention or fragments 
thereof. The pharmaceutical compositions of the invention 
may be used prophylactically or therapeutically. 

The invention also relates to antibodies to the 
hepatitis C viruses of the invention or their encoded 
polypeptides and to pharmaceutical compositions comprising 
these antibodies. 

The present invention further relates to 
polypeptides encoded by the nucleic acid sequences of the 
invention or fragments thereof. In one embodiment, said 
polypeptide or polypeptides are fully or partially 
purified from hepatitis C virus produced by cells 
transfected with nucleic acid sequence of the invention. 
In another embodiment, the polypeptide or polypeptides are 
produced recombinantly from a fragment of the nucleic acid 
sequences of the invention. In yet another embodiment, 
the polypeptides are chemically synthesized. 

The invention also relates to the use of the 
nucleic acid sequences of the invention to identify cell 
lines capable of supporting the replication of HCV in 
vitro. 

The invention further relates to the use of the 
nucleic acid sequences of the invention or their encoded 
proteases (e.g., NS3 protease) to develop screening assays 
to identify antiviral agents for HCV. 

Brief Description Of Figures 

Figure 1 shows a strategy for constructing 
full-length cDNA clones of HCV strain H77. The long PCR 
products amplified with H1 and H9417R primers were cloned 
directly into pGEM-9zf(-) after digestion with Not I and 
Xba I (pH21I and pH50I). Next, the 3' UTR was cloned into 
both pH21I and pH50I after digestion with Afl II and Xba I 
(pH21 and pH50). pH21 was tested for infectivity in a 
chimpanzee. To improve the efficiency of cloning, we 
constructed a cassette vector with consensus 5' and 3' 
termini of H77. This cassette vector (pCV) was obtained 
by cutting out the BamHI fragment (nts 1358-7530 of the 
H77 genome) from pH50, followed by religation. Finally, 
the long PCR products of H77 amplified with primers H1 and 
H9417R (H product) or primers A1 and H9417R (A product) 
were cloned into pCV after digestion with Age I and Afl II 
or with Pin AI and Bfr I. The latter procedure yielded 
multiple complete cDNA clones of strain H77 of HCV. 

Figure 2 shows the results of gel 
electrophoresis of long RT-PCR amplicons of the entire ORF 
of H77 and the transcription mixture of the infectious 
clone of H77. The complete ORF was amplified by long 
RT-PCR with the primers H1 or A1 and H9417R from 10^ GE of 
H77. A total of 10 µg of the consensus chimeric clone 
(pCV-H77C) linearized with Xba I was transcribed in a 100 
µl reaction with T7 RNA polymerase. Five µl of the 
transcription mixture was analyzed by gel electrophoresis 
and the remainder of the mixture was injected into a 
chimpanzee. Lane 1, molecular weight marker; lane 2, 
products amplified with primers H1 and H9417R; lane 3, 
products amplified with primers A1 and H9417R; lane 4, 
transcription mixture containing the RNA transcripts and 
linearized clone pCV-H77C (12.5 kb). 

Figure 3 is a diagram of the genome organization 
of HCV strain H77 and the genetic heterogeneity of 
individual full-length clones compared with the consensus 
sequence of H77. Solid lines represent aa changes. 
Dashed lines represent silent mutations. A * in pH21 
represents a point mutation at nt 58 in the 5' UTR. In 
the ORF, the consensus chimeric clone pCV-H77C had 11 nt 
differences [at positions 1625 (C→T), 2709 (T→C), 3380 
(A-^), 3710 (C→T), 3914 (G→A), 4463 (T→C), 5058 (C→T), 
5834 (C→T), 6734 (T→C), 7154 (C→T), and 7202 (T→C)] and 
one aa change (F→L at aa 790) compared with the 
consensus sequence of H77. This clone was infectious. 
Clone pH21 and pCV-H11 had 19 nts (7 aa) and 64 nts (21 
aa) differences, respectively, compared with the consensus 
sequence of H77. These two clones were not infectious. A 
single point mutation in the 3' UTR at nucleotide 9406 
(G→A) introduced to create an Afl II cleavage site is not 
shown. 

Figures 4A-4F show the complete nucleotide 
sequence of a H77C clone produced according to the present 
invention and Figures 4G-4H show the amino acid sequence 
encoded by the H77C clone. 

Figure 5 shows an agarose gel of long RT-PCR 
amplicons and transcription mixtures. Lanes 1 and 4: 
Molecular weight marker (Lambda/HindIII digest). Lanes 2 
and 3: RT-PCR amplicons of the entire ORF of HC-J4. Lane 
5: pCV-H77C transcription control (Yanagi et al., 1997). 
Lanes 6, 7, and 8: 1/40 of each transcription mixture of 
pCV-J4L2S, pCV-J4L4S and pCV-J4L6S, respectively, which 
was injected into the chimpanzee. 

Figure 6 shows the strategy utilized for the 
construction of full-length cDNA clones of HCV strain 
HC-J4. The long PCR products were cloned as two separate 
fragments (L and S) into a cassette vector (pCV) with 
fixed 5' and 3' termini of HCV (Yanagi et al., 1997). 
Full-length cDNA clones of HC-J4 were obtained by 
inserting the L fragment from three pCV-J4L clones into 
three identical pCV-J4S9 clones after digestion with 
PinAI (isoschizomer of AgeI) and BfrI (isoschizomer of 
AflII). 

Figure 7 shows amino acid positions with a 
quasispecies of HC-J4 in the acute phase plasma pool 
obtained from an experimentally infected chimpanzee. 
Cons-p9: consensus amino acid sequence deduced from 
analysis of nine L fragments and nine S fragments (see 
Fig. 6). Cons-D: consensus sequence derived from direct 
sequencing of the PCR product. A, B, C: groups of similar 
viral species. Dot: amino acid identical to that in 
Cons-p9. Capital letter: amino acid different from that in 
Cons-p9. Cons-F: composite consensus amino acid sequence 
combining Cons-p9 and Cons-D. Boxed amino acid: different 
from that in Cons-F. Shaded amino acid: different from 
that in all species A sequences. An *: defective ORF due 
to a nucleotide deletion (clone L1, aa 1097) or insertion 
(clone L7, aa 2770). Diagonal lines: fragments used to 
construct the infectious clone. 

Figure 8 shows comparisons (percent difference) 
of nucleotide (nts 156-8935) and predicted amino acid 
sequences (aa 1-2864) of L clones (species A, B, and C, 
this study), HC-J4/91 (Okamoto et al., 1992b) and HC-J4/83 
(Okamoto et al., 1992b). Differences among species A 
sequences and among species B sequences are shaded. 
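
As a hedged illustration of what a pairwise "percent difference" of this kind measures, the Python sketch below counts mismatched positions between two pre-aligned, equal-length sequences. The function name `percent_difference` and the example strings are assumptions introduced for this sketch; the text does not state how the values in Figure 8 were computed, and alignment gaps are not handled here.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


# Hypothetical 10-nt fragments differing at two positions (indices 4 and 9).
example = percent_difference("ACGTACGTAC", "ACGTTCGTAA")
print(example)  # 20.0
```

The same function applies unchanged to amino acid strings, since it only compares characters position by position.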

Figure 9 shows UPGMA ("unweighted pair group 
method with arithmetic mean") trees of HC-J4/91 (Okamoto 
et al., 1992b), HC-J4/83 (Okamoto et al., 1992b), two 
prototype strains of genotype 1b (HCV-J, Kato et al., 
1990; HCV-BK, Takamizawa et al., 1991), and L clones (this 
study). 

Figure 10 shows the alignment of the HVR1 and 
HVR2 amino acid sequences of the E2 sequences of nine L 
clones of HC-J4 (species A, B, and C) obtained from an 
early acute phase plasma pool of an experimentally 
infected chimpanzee compared with the sequences of eight 
clones (HC-J4/91-20 through HC-J4/91-27, Okamoto et al., 
1992b) derived from the inoculum. Dot: an amino acid 
identical to that in the top line. Capital letters: amino 
acid different from that in the top line. 

Figure 11 shows the alignment of the 5' UTR and 
the 3' UTR sequences of infectious clones of genotype 1a 
(pCV-H77C) and 1b (pCV-J4L6S). Top line: consensus 
sequence of the indicated strain. Dot: identity with 
consensus sequence. Capital letter: different from the 
consensus sequence. Dash: deletion. Underlined: PinAI 
and BfrI cleavage site. Numbering corresponds to the HCV 
sequence of pCV-J4L6S. 

Figure 12 shows a comparison of individual 
full-length cDNA clones of the ORF of HCV strain HC-J4 with 
the consensus sequence (see Fig. 7). Solid lines: amino 
acid changes. Dashed lines: silent mutations. Clone 
pCV-J4L6S was infectious in vivo whereas clones pCV-J4L2S and 
pCV-J4L4S were not. 

Figure 13 shows biochemical (ALT levels) and PCR 
analyses of a chimpanzee following percutaneous 
intrahepatic transfection with RNA transcripts of the 
infectious clone of pCV-J4L2S, pCV-J4L4S and pCV-J4L6S. 
The ALT serum enzyme levels were measured in units per 
liter (u/l). For the PCR analysis, "HCV RNA" represented 
by an open rectangle indicates a serum sample that was 
negative for HCV after nested PCR; "HCV RNA" represented 
by a closed rectangle indicates that the serum sample was 
positive for HCV and HCV GE titer on the right-hand y-axis 
represents genome equivalents. 

Figures 14A-14F show the nucleotide sequence of 
the infectious clone of genotype 1b strain HC-J4 and 
Figures 14G-14H show the amino acid sequence encoded by 
the HC-J4 clone. 

Figure 15 shows the strategy for constructing a
chimeric HCV clone designated pH77CV-J4 which contains the
nonstructural region of the infectious clone of genotype
1a strain H77 and the structural region of the infectious
clone of genotype 1b strain HC-J4.

Figures 16A-16F show the nucleotide sequence of
the chimeric 1a/1b clone pH77CV-J4 of Figure 15 and
Figures 16G-16H show the amino acid sequence encoded by
the chimeric 1a/1b clone.

Figures 17A and 17B show the sequence of the 3'
untranslated region remaining in various 3' deletion
mutants of the 1a infectious clone pCV-H77C and the
strategy utilized in constructing each 3' deletion mutant
(Figures 17C-17G).
Of the seven deletion mutants shown, two (pCV-
H77C(-98X) and (-42X)) have been constructed and tested
for infectivity in chimpanzees (see Figures 17A and 17C)
and the other six are to be constructed and tested for
infectivity as described in Figures 17D-17G.

Figures 18A and 18B show biochemical (ALT
levels), PCR (HCV RNA and HCV GE titer), serological
(anti-HCV) and histopathological (Fig. 18B only) analyses
of chimpanzees 1494 (Fig. 18A) and 1530 (Fig. 18B)
following transfection with the infectious cDNA clone pCV-
H77C.

The ALT serum enzyme levels were measured in
units per liter (u/l). For the PCR analysis, "HCV RNA"
represented by an open rectangle indicates a serum sample
that was negative for HCV after nested PCR; "HCV RNA"
represented by a closed rectangle indicates that the serum
sample was positive for HCV; and HCV GE titer on the
right-hand y-axis represents genome equivalents.

The bar marked "anti-HCV" indicates samples that
were positive for anti-HCV antibodies as determined by
commercial assays. The histopathology scores in Figure
18B correspond to no histopathology (O), mild hepatitis
(0) and moderate to severe hepatitis (•).

DESCRIPTION OF THE INVENTION 
The present invention relates to nucleic acid 
sequences which comprise the genome of an infectious 
hepatitis C virus. More specifically, the invention 
relates to nucleic acid sequences which encode infectious 
hepatitis C viruses of genotype 1a and 1b strains. In one
embodiment, the infectious nucleic acid sequence of the
invention has the sequence shown in Figures 4A-4F of this
application. In another embodiment, the infectious
nucleic acid sequence has the sequence shown in Figures
14A-14F and is contained in a plasmid construct deposited
with the American Type Culture Collection (ATCC) on
January 26, 1998 and having ATCC accession number .

The invention also relates to "chimeric nucleic
acid sequences" where the chimeric nucleic acid sequences
consist of open reading frame sequences taken from
infectious nucleic acid sequences of hepatitis C viruses
of different genotypes or subtypes.

In one embodiment, the chimeric nucleic acid
sequence consists of sequence from the genome of an HCV
strain belonging to one genotype or subtype which encodes
structural polypeptides and sequence of an HCV strain
belonging to another genotype strain or subtype which
encodes nonstructural polypeptides. Such chimeras can be
produced by standard techniques of restriction digestion,
PCR amplification and subcloning known to those of
ordinary skill in the art.

In a preferred embodiment, the sequence encoding 
nonstructural polypeptides is from an infectious nucleic 
acid sequence encoding a genotype 1a strain where the
construction of a chimeric 1a/1b nucleic acid sequence is
described in Example 9 and the chimeric 1a/1b nucleic acid
sequence is shown in Figures 16A-16F. It is believed that
the construction of such chimeric nucleic acid sequences 
will be of importance in studying the growth and virulence 
properties of hepatitis C virus and in the production of 
hepatitis C viruses suitable to confer protection against 
multiple genotypes of HCV. For example, one might produce 
a "multivalent" vaccine by putting epitopes from several 
genotypes or subtypes into one clone. Alternatively one 
might replace just a single gene from an infectious 
sequence with the corresponding gene from the genomic 
sequence of a strain from another genotype or subtype or 
create a chimeric gene which contains portions of a gene 
from two genotypes or subtypes. Examples of genes which 
could be replaced or which could be made chimeric, 
include, but are not limited to, the E1, E2 and NS4 genes.
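
The chimeric design described above can be illustrated in silico with a minimal Python sketch that joins the structural region of one open reading frame to the nonstructural region of another at a shared junction. The function name, the toy sequences and the junction coordinate are hypothetical placeholders and are not taken from the sequences of this application; the sketch also assumes that the two sequences are aligned, of equal length, and begin at the first codon of the open reading frame.

```python
def build_chimera(structural_donor: str, nonstructural_donor: str, junction: int) -> str:
    """Join donor A up to `junction` with donor B from `junction` onward.

    Assumptions (illustrative only): both inputs are aligned ORF sequences of
    equal length, and `junction` is a 0-based nucleotide index.
    """
    if len(structural_donor) != len(nonstructural_donor):
        raise ValueError("sequences must be aligned to the same length")
    if not 0 <= junction <= len(structural_donor):
        raise ValueError("junction lies outside the sequence")
    if junction % 3 != 0:
        # Joining inside a codon could alter the encoded amino acid at the junction.
        raise ValueError("junction must fall on a codon boundary")
    return structural_donor[:junction] + nonstructural_donor[junction:]


# Toy example: first two codons from donor A, remainder from donor B.
donor_a = "AAAAAAGGG"
donor_b = "CCCCCCTTT"
chimera = build_chimera(donor_a, donor_b, 6)
```

Such a check only models the intended product; the physical construct would still be made by restriction digestion, PCR amplification and subcloning as stated above.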

The invention further relates to mutations of
the infectious nucleic acid sequences where "mutations"
includes, but is not limited to, point mutations,
deletions and insertions. Of course, one of ordinary
skill in the art would recognize that the size of the
insertions would be limited by the ability of the
resultant nucleic acid sequence to be properly packaged
within the virion. Such mutations could be produced by
techniques known to those of skill in the art such as
site-directed mutagenesis, fusion PCR, and restriction
digestion followed by religation.

In one embodiment, mutagenesis might be 
undertaken to determine sequences that are important for 
viral properties such as replication or virulence. For 
example, one may introduce a mutation into the infectious 
nucleic acid sequence which eliminates the cleavage site 
between the NS4A and NS4B polypeptides to examine the 
effects on viral replication and processing of the 
polypeptide. Alternatively, one or more of the three amino
acids encoded by the infectious 1b nucleic acid sequence
shown in Figures 14A-14F which differ from the HC-J4
consensus sequence may be back mutated to the
corresponding amino acid in the HC-J4 consensus sequence
to determine the importance of these three amino acid
changes to infectivity or virulence. In yet another
embodiment, one or more of the amino acids from the
noninfectious 1b clones pCV-J4L2S and pCV-J4L4S which
differ from the consensus sequence may be introduced into
the infectious 1b sequence shown in Figures 14A-14F.
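
By way of illustration only, the comparison of a clone with a consensus sequence, and the back mutation of selected positions, can be sketched in Python as follows. The helper names and the short peptide strings are hypothetical and do not reproduce any sequence of this application; the sketch assumes that the two amino acid sequences are already aligned and of equal length.

```python
def differences_from_consensus(clone: str, consensus: str):
    """List (position, consensus residue, clone residue), positions 1-based."""
    if len(clone) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, cons_aa, clone_aa)
        for i, (cons_aa, clone_aa) in enumerate(zip(consensus, clone))
        if cons_aa != clone_aa
    ]


def back_mutate(clone: str, consensus: str, positions):
    """Return the clone with the given 1-based positions reverted to consensus."""
    residues = list(clone)
    for pos in positions:
        residues[pos - 1] = consensus[pos - 1]
    return "".join(residues)


# Toy sequences (hypothetical): the clone differs at positions 5 and 10.
consensus_seq = "MSTNPKPQRK"
clone_seq = "MSTNAKPQRR"
diffs = differences_from_consensus(clone_seq, consensus_seq)
reverted = back_mutate(clone_seq, consensus_seq, [5])
```

Listing the differences first makes it straightforward to revert them one at a time or in combination, so that the contribution of each change to infectivity can be tested separately.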

In yet another example, one may delete all or 
part of a gene or of the 5' or 3' nontranslated region 
contained in an infectious nucleic acid sequence and then 
transfect a host cell (animal or cell culture) with the 
mutated sequence and measure viral replication in the host 
by methods known in the art such as RT-PCR. Preferred
genes include, but are not limited to, the p7, NS4B and
NS5A genes. Of course, those of ordinary skill in the art
will understand that deletion of part of a gene,
preferably the central portion of the gene, may be
preferable to deletion of the entire gene in order to
conserve the cleavage site boundaries which exist between
proteins in the HCV polyprotein and which are necessary
for proper processing of the polyprotein.

In the alternative, if the transfection is into 
a host animal such as a chimpanzee, one can monitor the 
virulence phenotype of the virus produced by transfection 
of the mutated infectious nucleic acid sequence by methods 
known in the art such as measurement of liver enzyme 
levels (alanine aminotransferase (ALT) or isocitrate
dehydrogenase (ICD)) or by histopathology of liver
biopsies. Thus, mutations of the infectious nucleic acid 
sequences may be useful in the production of attenuated 
HCV strains suitable for vaccine use. 

The invention also relates to the use of the 
infectious nucleic acid sequences of the present invention 
to produce attenuated viral strains via passage in vitro 
or in vivo of the virus produced by transfection with the 
infectious nucleic acid sequences.

The present invention therefore relates to the 
use of the nucleic acid sequences of the invention to 
identify cell lines capable of supporting the replication 
of HCV. 

In particular, it is contemplated that the 
mutations of the infectious nucleic acid sequences of the 
invention and the production of chimeric sequences as 
discussed above may be useful in identifying sequences 
critical for cell culture adaptation of HCV and hence, may 
be useful in identifying cell lines capable of supporting 
HCV replication. 

Transfection of tissue culture cells with the 
nucleic acid sequences of the invention may be done by 
methods of transfection known in the art such as 
electroporation, precipitation with DEAE-Dextran or 
calcium phosphate or liposomes.

In one such embodiment, the method comprises the
growing of animal cells, especially human cells, in vitro
and transfecting the cells with the nucleic acid of the
invention, then determining if the cells show indicia of
HCV infection. Such indicia include the detection of
viral antigens in the cell, for example, by
immunofluorescent procedures well known in the art; the
detection of viral polypeptides by Western blotting using
antibodies specific therefor; and the detection of newly
transcribed viral RNA within the cells via methods such as
RT-PCR. The presence of live, infectious virus particles
following such tests may also be shown by injection of
cell culture medium or cell lysates into healthy,
susceptible animals, with subsequent exhibition of the
symptoms of HCV infection.

Suitable cells or cell lines for culturing HCV 
include, but are not limited to, lymphocyte and hepatocyte 
cell lines known in the art. 

Alternatively, primary hepatocytes can be 
cultured, and then infected with HCV; or, the hepatocyte 
cultures could be derived from the livers of infected 
chimpanzees. In addition, various immortalization methods 
known to those of ordinary skill in the art can be used to 
obtain cell-lines derived from hepatocyte cultures. For 
example, primary hepatocyte cultures may be fused to a
variety of cells to maintain stability. 

The present invention further relates to the in 
vitro and in vivo production of hepatitis C viruses from 
the nucleic acid sequences of the invention. 

In one embodiment, the sequences of the 
invention can be inserted into an expression vector that 
functions in eukaryotic cells. Eukaryotic expression 
vectors suitable for producing high efficiency gene 
transfer in vivo are well known to those of ordinary skill 
in the art and include, but are not limited to, plasmids, 
vaccinia viruses, retroviruses, adenoviruses and
adeno-associated viruses.

In another embodiment, the sequences contained 
in the recombinant expression vector can be transcribed in 
vitro by methods known to those of ordinary skill in the 
art in order to produce RNA transcripts which encode the 
hepatitis C viruses of the invention. The hepatitis C 
viruses of the invention may then be produced by 
transfecting cells by methods known to those of ordinary 
skill in the art with either the in vitro transcription 
mixture containing the RNA transcripts (see Example 4) or 
with the recombinant expression vectors containing the 
nucleic acid sequences described herein. 

The present invention also relates to the 
construction of cassette vectors useful in the cloning of 
viral genomes wherein said vectors comprise a nucleic acid 
sequence to be cloned, and said vector reading in the 
correct phase for the expression of the viral nucleic acid 
to be cloned. Such a cassette vector will, of course, 
also possess a promoter sequence, advantageously placed 
upstream of the sequence to be expressed. Cassette 
vectors according to the present invention are constructed 
according to the procedure described in Figure 1, for 
example, starting with plasmid pCV. Of course, the DNA to
be inserted into said cassette vector can be derived from 
any virus, advantageously from HCV, and most 
advantageously from the H77 strain of HCV. The nucleic 
acid to be inserted according to the present invention 
can, for example, contain one or more open reading frames 
of the virus, for example, HCV. The cassette vectors of 
the present invention may also contain, optionally, one or 
more expressible marker genes for expression as an 
indication of successful transfection and expression of 
the nucleic acid sequences of the vector. To ensure
expression, the cassette vectors of the present invention
will contain a promoter sequence for binding of the
appropriate cellular RNA polymerase, which will depend on
the cell into which the vector has been introduced. For
example, if the host cell is a bacterial cell, then said 
promoter will be a bacterial promoter sequence to which 
the bacterial RNA polymerases will bind. 

The hepatitis C viruses produced from the 
sequences of the invention may be purified or partially 
purified from the transfected cells by methods known to 
those of ordinary skill in the art. In a preferred 
embodiment, the viruses are partially purified prior to 
their use as immunogens in the pharmaceutical compositions 
and vaccines of the present invention. 

The present invention therefore relates to the 
use of the hepatitis C viruses produced from the nucleic 
acid sequences of the invention as immunogens in live or 
killed (e.g., formalin inactivated) vaccines to prevent
hepatitis C in a mammal. 

In an alternative embodiment, the immunogen of 
the present invention may be an infectious nucleic acid 
sequence, a chimeric nucleic acid sequence, or a mutated 
infectious nucleic acid sequence which encodes a hepatitis 
C virus. Where the sequence is a cDNA sequence, the cDNAs 
and their RNA transcripts may be used to transfect a 
mammal by direct injection into the liver tissue of the 
mammal as described in the Examples. 

Alternatively, direct gene transfer may be 
accomplished via administration of a eukaryotic expression 
vector containing a nucleic acid sequence of the 
invention. 

In yet another embodiment, the immunogen may be 
a polypeptide encoded by the nucleic acid sequences of the 
invention. The present invention therefore also relates 
to polypeptides produced from the nucleic acid sequences 
of the invention or fragments thereof. In one embodiment, 
polypeptides of the present invention can be recombinantly 
produced by synthesis from the nucleic acid sequences of 
the invention or isolated fragments thereof, and purified,
or partially purified, from transfected cells using
methods already known in the art. In an alternative
embodiment, the polypeptides may be purified or partially 
purified from viral particles produced via transfection of 
a host cell with the nucleic acid sequences of the 
invention. Such polypeptides might, for example, include 
either capsid or envelope polypeptides prepared from the 
sequences of the present invention. 

When used as immunogens, the nucleic acid
sequences of the invention, or the polypeptides or viruses 
produced therefrom, are preferably partially purified 
prior to use as immunogens in pharmaceutical compositions 
and vaccines of the present invention. When used as a 
vaccine, the sequences and the polypeptide and virus 
products thereof, can be administered alone or in a 
suitable diluent, including, but not limited to, water, 
saline, or some type of buffered medium. The vaccine 
according to the present invention may be administered to 
an animal, especially a mammal, and most especially a 
human, by a variety of routes, including, but not limited 
to, intradermally, intramuscularly, subcutaneously, or in 
any combination thereof.

Suitable amounts of material to administer for 
prophylactic and therapeutic purposes will vary depending 
on the route selected and the immunogen (nucleic acid, 
virus, polypeptide) administered. One skilled in the art 
will appreciate that the amounts to be administered for 
any particular treatment protocol can be readily 
determined without undue experimentation. The vaccines of 
the present invention may be administered once or
periodically until a suitable titer of anti-HCV antibodies
appears in the blood. For an immunogen consisting of a
nucleic acid sequence, a suitable amount of nucleic acid
sequence to be used for prophylactic purposes might be
expected to fall in the range of from about 100 µg to
about 5 mg and most preferably in the range of from about
500 µg to about 2 mg. For a polypeptide, a suitable amount
to use for prophylactic purposes is preferably 100 ng to
100 µg and for a virus 10^ to 10^ infectious doses. Such
administration will, of course, occur prior to any sign of 
HCV infection. 

A vaccine of the present invention may be 
employed in such forms as capsules, liquid solutions, 
suspensions or elixirs for oral administration, or sterile 
liquid forms such as solutions or suspensions. Any inert
carrier is preferably used, such as saline or
phosphate-buffered saline, or any such carrier in which the
HCV of the present invention can be suitably suspended. The
vaccines may be in the form of single dose preparations or
in multi-dose flasks which can be utilized for
mass-vaccination programs of both animals and humans. For
purposes of using the vaccines of the present invention
reference is made to Remington's Pharmaceutical Sciences,
Mack Publishing Co., Easton, Pa., Osol (Ed.) (1980); and
New Trends and Developments in Vaccines, Voller et al.
(Eds.), University Park Press, Baltimore, Md. (1978), both
of which provide much useful information for preparing and 
using vaccines. Of course, the polypeptides of the 
present invention, when used as vaccines, can include, as 
part of the composition or emulsion, a suitable adjuvant, 
such as alum (or aluminum hydroxide) when humans are to be 
vaccinated, to further stimulate production of antibodies 
by immune cells. When nucleic acids or viruses are used
for vaccination purposes, other specific adjuvants such as
CpG motifs (Krieg, A.K. et al. (1995) and (1996)), may
prove useful.

When the nucleic acids, viruses and polypeptides 
of the present invention are used as vaccines or inocula, 
they will normally exist as physically discrete units 
suitable as a unitary dosage for animals, especially 
mammals, and most especially humans, wherein each unit 
will contain a predetermined quantity of active material
calculated to produce the desired immunogenic effect in
association with the required diluent. The dose of said
vaccine or inoculum according to the present invention is
administered at least once. In order to increase the
antibody level, a second or booster dose may be
administered at some time after the initial dose. The
need for, and timing of, such booster dose will, of
course, be determined within the sound judgment of the
administrator of such vaccine or inoculum and according to
sound principles well known in the art. For example, such
booster dose could reasonably be expected to be
advantageous at some time between about 2 weeks and about 6
months following the initial vaccination. Subsequent
doses may be administered as indicated. 

The nucleic acid sequences, viruses and 
polypeptides of the present invention can also be 
administered for purposes of therapy, where a mammal, 
especially a primate, and most especially a human, is 
already infected, as shown by well known diagnostic 
measures. When the nucleic acid sequences, viruses or 
polypeptides of the present invention are used for such 
therapeutic purposes, many of the same criteria will apply
as when they are used as a vaccine, except that inoculation
will occur post-infection. Thus, when the nucleic acid
sequences, viruses or polypeptides of the present 
invention are used as therapeutic agents in the treatment 
of infection, the therapeutic agent comprises a 
pharmaceutical composition containing a sufficient amount 
of said nucleic acid sequences, viruses or polypeptides so 
as to elicit a therapeutically effective response in the 
organism to be treated. Of course, the amount of 
pharmaceutical composition to be administered will, as for 
vaccines, vary depending on the immunogen contained
therein (nucleic acid, polypeptide, virus) and on the 
route of administration. 

The therapeutic agent according to the present
invention can thus be administered by subcutaneous,
intramuscular or intradermal routes. One skilled in the
art will certainly appreciate that the amounts to be 
administered for any particular treatment protocol can be 
readily determined without undue experimentation. Of 
course, the actual amounts will vary depending on the 
route of administration as well as the sex, age, and 
clinical status of the subject which, in the case of human 
patients, is to be determined with the sound judgment of 
the clinician. 

The therapeutic agent of the present invention 
can be employed in such forms as capsules, liquid 
solutions, suspensions or elixirs, or sterile liquid forms 
such as solutions or suspensions. Any inert carrier is
preferably used, such as saline, phosphate-buffered
saline, or any such carrier in which the HCV of the
present invention can be suitably suspended. The
therapeutic agents may be in the form of single dose
preparations or in multi-dose flasks which can be
utilized for mass-treatment programs of both animals and
humans. Of course, when the nucleic acid sequences,
viruses or polypeptides of the present invention are used
as therapeutic agents they may be administered as a single
dose or as a series of doses, depending on the situation
as determined by the person conducting the treatment.

The nucleic acids, polypeptides and viruses of 
the present invention can also be utilized in the 
production of antibodies against HCV. The term "antibody" 
is herein used to refer to immunoglobulin molecules and 
immunologically active portions of immunoglobulin 
molecules. Examples of antibody molecules are intact 
immunoglobulin molecules, substantially intact 
immunoglobulin molecules and portions of an immunoglobulin 
molecule, including those portions known in the art as 
Fab, F(ab')2 and F(v) as well as chimeric antibody
molecules.

Thus, the polypeptides, viruses and nucleic acid
sequences of the present invention can be used in the
generation of antibodies that immunoreact (i.e., specific
binding between an antigenic determinant-containing
molecule and a molecule containing an antibody combining 
site such as a whole antibody molecule or an active 
portion thereof) with antigenic determinants on the 
surface of hepatitis C virus particles. 

The present invention therefore also relates to 
antibodies produced following immunization with the 
nucleic acid sequences, viruses or polypeptides of the 
present invention. These antibodies are typically 
produced by immunizing a mammal with an immunogen or 
vaccine to induce antibody molecules having 
immunospecificity for polypeptides or viruses produced in
response to infection with the nucleic acid sequences of 
the present invention. When used in generating such 
antibodies, the nucleic acid sequences, viruses, or 
polypeptides of the present invention may be linked to 
some type of carrier molecule. The resulting antibody 
molecules are then collected from said mammal. Antibodies 
produced according to the present invention have the 
unique advantage of being generated in response to 
authentic, functional polypeptides produced according to 
the actual cloned HCV genome. 

The antibody molecules of the present invention 
may be polyclonal or monoclonal. Monoclonal antibodies
are readily produced by methods well known in the art.
Portions of immunoglobulin molecules, such as Fabs, as well
as chimeric antibodies, may also be produced by methods
well known to those of ordinary skill in the art of
generating such antibodies.

The antibodies according to the present 
invention may also be contained in blood plasma, serum, 
hybridoma supernatants, and the like. Alternatively, the
antibody of the present invention is isolated to the
extent desired by well known techniques such as, for
example, using DEAE Sephadex. The antibodies produced 
according to the present invention may be further purified 
so as to obtain specific classes or subclasses of antibody 
such as IgM, IgG, IgA, and the like. Antibodies of the 
IgG class are preferred for purposes of passive 
protection. 

The antibodies of the present invention are 
useful in the prevention and treatment of diseases caused 
by hepatitis C virus in animals, especially mammals, and 
most especially humans. 

In providing the antibodies of the present 
invention to a recipient mammal, preferably a human, the 
dosage of administered antibodies will vary depending on 
such factors as the mammal's age, weight, height, sex, 
general medical condition, previous medical history, and 
the like. 

In general, it will be advantageous to provide 
the recipient mammal with a dosage of antibodies in the 
range of from about 1 mg/kg body weight to about 10 mg/kg 
body weight of the mammal, although a lower or higher dose 
may be administered if found desirable. Such antibodies 
will normally be administered by intravenous or 
intramuscular route as an inoculum. The antibodies of the 
present invention are intended to be provided to the 
recipient subject in an amount sufficient to prevent, 
lessen or attenuate the severity, extent or duration of 
any existing infection. 
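
The weight-based range stated above reduces to simple arithmetic, shown here as a minimal Python sketch. The function name is hypothetical, the default bounds merely restate the approximate 1 mg/kg to 10 mg/kg range of the preceding paragraph, and the example body weight is an arbitrary illustration; the sketch is not a dosing recommendation.

```python
def antibody_dose_range_mg(body_weight_kg: float,
                           low_mg_per_kg: float = 1.0,
                           high_mg_per_kg: float = 10.0):
    """Return (lowest, highest) total antibody dose in mg for a body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("lower bound exceeds upper bound")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


# Example with an arbitrary 70 kg recipient.
low_mg, high_mg = antibody_dose_range_mg(70)
```

As the text notes, a lower or higher dose may be administered if found desirable, so the computed range is only a starting point for the clinician's judgment.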

The antibodies prepared by use of the nucleic 
acid sequences, viruses or polypeptides of the present 
invention are also highly useful for diagnostic purposes. 
For example, the antibodies can be used as in vitro 
diagnostic agents to test for the presence of HCV in 
biological samples taken from animals, especially humans. 
Such assays include, but are not limited to, 
radioimmunoassays, EIA, fluorescence, Western blot
analysis and ELISAs. In one such embodiment, the
biological sample is contacted with antibodies of the
present invention and a labeled second antibody is used to
detect the presence of HCV to which the antibodies are
bound.

Such assays may be, for example, a direct
protocol (where the labeled first antibody is
immunoreactive with the antigen, such as, for example, a
polypeptide on the surface of the virus), an indirect
protocol (where a labeled second antibody is reactive with
the first antibody), a competitive protocol (such as would
involve the addition of a labeled antigen), or a sandwich
protocol (where both labeled and unlabeled antibody are
used), as well as other protocols well known and described
in the art.

In one embodiment, an immunoassay method would 
utilize an antibody specific for HCV envelope determinants 
and would further comprise the steps of contacting a 
biological sample with the HCV-specific antibody and then
detecting the presence of HCV material in the test sample 
using one of the types of assay protocols as described 
above. Polypeptides and antibodies produced according to 
the present invention may also be supplied in the form of 
a kit, either present in vials as purified material, or 
present in compositions and suspended in suitable diluents 
as previously described. 

In a preferred embodiment, such a diagnostic 
test kit for detection of HCV antigens in a test sample 
comprises in combination a series of containers, each
containing a reagent needed for such assay. Thus, one such
container would contain a specific amount of HCV-specific
antibody as already described, a second container would
contain a diluent for suspension of the sample to be
tested, a third container would contain a positive control
and an additional container would contain a negative
control. An additional container could contain a blank.

For all prophylactic, therapeutic and diagnostic
uses, the antibodies of the invention and other reagents, 
plus appropriate devices and accessories, may be provided 
in the form of a kit so as to facilitate ready 
availability and ease of use. 

The present invention also relates to the use of 
nucleic acid sequences and polypeptides of the present 
invention to screen potential antiviral agents for 
antiviral activity against HCV. Such screening methods 
are known by those of skill in the art. Generally, the 
antiviral agents are tested at a variety of 
concentrations, for their effect on preventing viral 
replication in cell culture systems which support viral 
replication, and then for an inhibition of infectivity or 
of viral pathogenicity (and a low level of toxicity) in an 
animal model system. 

In one embodiment, animal cells (especially 
human cells) transfected with the nucleic acid sequences 
of the invention are cultured in vitro and the cells are 
treated with a candidate antiviral agent (a chemical, 
peptide etc.) for antiviral activity by adding the 
candidate agent to the medium. The treated cells are then 
exposed, possibly under transfecting or fusing conditions 
known in the art, to the nucleic acid sequences of the 
present invention. A sufficient period of time would then 
be allowed to pass for infection to occur, following which 
the presence or absence of viral replication would be 
determined versus untreated control cells by methods known 
to those of ordinary skill in the art. Such methods 
include, but are not limited to, the detection of viral
antigens in the cell, for example, by immunofluorescent
procedures well known in the art; the detection of viral
polypeptides by Western blotting using antibodies specific
therefor; the detection of newly transcribed viral RNA
within the cells by RT-PCR; and the detection of the
presence of live, infectious virus particles by injection
of cell culture medium or cell lysates into healthy,
susceptible animals, with subsequent exhibition of the
symptoms of HCV infection. A comparison of results
obtained for control cells (treated only with nucleic acid
sequence) with those obtained for treated cells (nucleic
acid sequence and antiviral agent) would indicate the
degree, if any, of antiviral activity of the candidate 
antiviral agent. Of course, one of ordinary skill in the 
art would readily understand that such cells can be 
treated with the candidate antiviral agent either before 
or after exposure to the nucleic acid sequence of the 
present invention so as to determine what stage, or 
stages, of viral infection and replication said agent is 
effective against. 
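
One possible way to express the comparison between control and treated cells numerically is sketched below in Python, assuming that viral replication has been quantified as genome equivalents (GE) per ml by RT-PCR. The function names and the example titers are hypothetical and are not data from this application.

```python
import math


def log10_reduction(control_ge_per_ml: float, treated_ge_per_ml: float) -> float:
    """Log10 drop in titer of treated cells relative to untreated control cells."""
    if control_ge_per_ml <= 0 or treated_ge_per_ml <= 0:
        raise ValueError("titers must be positive to take a logarithm")
    return math.log10(control_ge_per_ml / treated_ge_per_ml)


def percent_inhibition(control_ge_per_ml: float, treated_ge_per_ml: float) -> float:
    """Percentage of the control signal that is lost in treated cells."""
    if control_ge_per_ml <= 0 or treated_ge_per_ml < 0:
        raise ValueError("control must be positive and treated non-negative")
    return 100.0 * (1.0 - treated_ge_per_ml / control_ge_per_ml)


# Hypothetical titers: control 1e5 GE/ml, treated 1e3 GE/ml.
reduction = log10_reduction(1e5, 1e3)
inhibition = percent_inhibition(1e5, 1e3)
```

Running the same calculation for cells treated before and after exposure to the nucleic acid sequence would allow the stage-specific effect of a candidate agent to be compared on one scale.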

In an alternative embodiment, a protease such as 
NS3 protease produced from a nucleic acid sequence of the 
invention may be used to screen for protease inhibitors 
which may act as antiviral agents. The structural and 
nonstructural regions of the HCV genome, including 
nucleotide and amino acid locations, have been determined, 
for example, as depicted in Houghton, M. (1996), Fig. 1;
and Major, M.E. et al. (1997), Table 1.

Such above-mentioned protease inhibitors may 
take the form of chemical compounds or peptides which 
mimic the known cleavage sites of the protease and may be 
screened using methods known to those of skill in the art 
(Houghton, M. (1996) and Major, M.E. et al. (1997)). For
example, a substrate may be employed which mimics the
protease's natural substrate, but which provides a
detectable signal (e.g. by fluorimetric or colorimetric 
methods) when cleaved. This substrate is then incubated 
with the protease and the candidate protease inhibitor 
under conditions of suitable pH, temperature etc. to 
detect protease activity. The proteolytic activities of 
the protease in the presence or absence of the candidate 
inhibitor are then determined. 

In yet another embodiment, a candidate antiviral 
agent (such as a protease inhibitor) may be directly 
assayed in vivo for antiviral activity by administering 
the candidate antiviral agent to a chimpanzee transfected 
with a nucleic acid sequence of the invention and then 
measuring viral replication in vivo via methods such as 
RT-PCR. Of course, the chimpanzee may be treated with the 
candidate agent either before or after transfection with 
the infectious nucleic acid sequence so as to determine 
what stage, or stages, of viral infection and replication 
the agent is effective against. 

The invention also provides that the nucleic 
acid sequences, viruses and polypeptides of the invention 
may be supplied in the form of a kit, alone or in the form 
of a pharmaceutical composition. 

All scientific publications and/or patents cited 
herein are specifically incorporated by reference. The 
following examples illustrate various aspects of the 
invention but are in no way intended to limit the scope 
thereof. 

EXAMPLES 
MATERIALS AND METHODS 
For Examples 1-4 

Collection of Virus 

Hepatitis C virus was collected and used as a 
source for the RNA used in generating the cDNA clones 
according to the present invention. Plasma containing 
strain H77 of HCV was obtained from a patient in the acute 
phase of transfusion-associated non-A, non-B hepatitis 
(Feinstone et al. (1981)). Strain H77 belongs to genotype 
1a of HCV (Ogata et al. (1991), Inchauspe et al. (1991)). 
The consensus sequence for most of its genome has been 
determined (Kolyakov et al. (1996), Ogata et al. (1991), 
Inchauspe et al. (1991) and Farci et al. (1996)). 

RNA Purification 

Viral RNA was collected and purified by 
conventional means. In general, total RNA from 10 µl of 
H77 plasma was extracted with the TRIzol system (GIBCO 
BRL). The RNA pellet was resuspended in 100 µl of 10 mM 
dithiothreitol (DTT) with 5% (vol/vol) RNasin (20-40 
units/µl) (available from Promega) and 10 µl aliquots were 
stored at -80 °C. In subsequent experiments RT-PCR was 
performed on RNA equivalent to 1 µl of H77 plasma, which 
contained an estimated 10^ genome equivalents (GE) of HCV 
(Yanagi et al. (1996)). 

Primers used in the RT-PCR process were deduced from 
the genomic sequences of strain H77 according to 
procedures already known in the art (see above) or else 
were determined specifically for use herein. The primers 
generated for this purpose are listed in Table 1. 

Table 1. Oligonucleotides used for PCR amplification of 
strain H77 of HCV 

Designation   Sequence (5' → 3')* 

H9261F        GGCTACAGCGGGGGGAGACATTTATCACAGC 
H3'X58R       TCATGCGGCTCACGGACCTTTCACAGCTAG 
H9282F        GTCCAAGCTTATCACAGCGTGTCTCATGCCCGGCCCCG 
H3'X45R       CGTCTCTAGAGGACCTTTCACAGCTAGCCGTGACTAGGG 
H9375F        TGAAGGTTGGGGTAAACACTCCGGCCTCTTAGGCCATT 
H3'X-35R      ACATGATCTGCAGAGAGGCCAGTATCAGCACTCTC 
H9386F        aTrC AAGCTTACGCGTA AACACTCCGGCCTCCTTAAGCCATTCCTG 
H3'X-38R      CGTCTCTAGACATGATCTGCAGAGAGGCCAGTATCAGCACTCTCTGC 
H1            l - l - i -I TI - l ' i - GCGGCCGC TAATACGACTCACTATAGCCAGCCCCCTGAT- 
              GGGGGCGACACTCCACCATG 
A1            ACTGTCTTCACGCAGAAAGCGTCTAGCCAT 
H9417R        CGTCTCTAGACAGGAAATGGCTTAAGAGGCCGGAGTGTTTACC 

* HCV sequences are shown in plain text, non-HCV-specific sequences are shown in boldface 
and artificial cleavage sites used for cDNA cloning are underlined. The core sequence of the T7 
promoter in primer H1 is shown in italics. 

Primers for long RT-PCR were size-purified. 

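
Because the typographic marking of the artificial cleavage sites is not preserved in plain text, the sites can be located by a simple string search. The sketch below is illustrative only: it uses three primer sequences copied from Table 1 and the commonly cited recognition sequences of Hind III, Xba I, Afl II and Not I, and the helper name `find_sites` is hypothetical. All four recognition sequences are palindromic, so scanning one strand is sufficient.

```python
# Recognition sequences (5' -> 3') of the enzymes named in the text.
SITES = {
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
    "AflII": "CTTAAG",
    "NotI": "GCGGCCGC",
}

# Primer sequences as listed in Table 1.
PRIMERS = {
    "H9282F": "GTCCAAGCTTATCACAGCGTGTCTCATGCCCGGCCCCG",
    "H3'X45R": "CGTCTCTAGAGGACCTTTCACAGCTAGCCGTGACTAGGG",
    "H9417R": "CGTCTCTAGACAGGAAATGGCTTAAGAGGCCGGAGTGTTTACC",
}


def find_sites(seq: str) -> dict:
    """Return {enzyme: 1-based position of the first occurrence} for each site found."""
    hits = {}
    for enzyme, site in SITES.items():
        pos = seq.upper().find(site)
        if pos != -1:
            hits[enzyme] = pos + 1
    return hits


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

Under these assumptions the search reports a Hind III site in H9282F, an Xba I site in H3'X45R, and both an Xba I and an Afl II site in H9417R, in keeping with the enzymes used for cloning in the sections that follow.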
cDNA Synthesis 

The RNA was denatured at 65 °C for 2 min, and 
cDNA synthesis was performed in a 20 µl reaction volume 
with Superscript II reverse transcriptase (from GIBCO/BRL) 
at 42 °C for 1 hour using specific antisense primers as 
described previously (Tellier et al. (1996)). The cDNA 
mixture was treated with RNase H and RNase T1 (GIBCO/BRL) 
for 20 min at 37 °C. 

Amplification and Cloning of the 3' UTR 

The 3' UTR of strain H77 was amplified by PCR 
in two different assays. In both of these nested PCR 
reactions the first round of PCR was performed in a total 
volume of 50 µl in 1 x buffer, 250 µmol of each 
deoxynucleoside triphosphate (dNTP; Pharmacia), 20 pmol 
each of external sense and antisense primers, 1 µl of the 
Advantage KlenTaq polymerase mix (from Clontech) and 2 µl 
of the final cDNA reaction mixture. In the second round 
of PCR, 5 µl of the first round PCR mixture was added to 
45 µl of PCR mixture prepared as already described. Each 
round of PCR (35 cycles), which was performed in a Perkin 
Elmer DNA thermal cycler 480, consisted of denaturation at 
94 °C for 1 min (in 1st cycle 1 min 30 sec), annealing at 
60 °C for 1 min and elongation at 68 °C for 2 min. In one 
experiment a region from NS5B to the conserved region of 
the 3' UTR was amplified with the external primers H9261F 
and H3'X58R, and the internal primers H9282F and H3'X45R 
(Table 1). In another experiment, a segment of the 
variable region to the very end of the 3' UTR was 
amplified with the external primers H9375F and H3'X-35R, 
and the internal primers H9386F and H3'X-38R (Table 1, 
Fig. 1). Amplified products were purified with QIAquick 
PCR purification kit (from QIAGEN), digested with Hind III 
and Xba I (from Promega), purified by either gel 
electrophoresis or phenol/chloroform extraction, and then 
cloned into the multiple cloning site of plasmid 
pGEM-9zf(-) (Promega) or pUC19 (Pharmacia). Cloning of cDNA 
into the vector was performed with T4 DNA ligase (Promega) 
by standard procedures. 

Amplification of Near Full-Length H77 Genomes by Long PCR 

The reactions were performed in a total volume 
of 50 µl in 1 x buffer, 250 µmol of each dNTP, 10 pmol 
each of sense and antisense primers, 1 µl of the Advantage 
KlenTaq polymerase mix and 2 µl of the cDNA reaction 
mixture (Tellier et al. (1996)). A single PCR round of 35 
cycles was performed in a Robocycler thermal cycler (from 
Stratagene), and consisted of denaturation at 99 °C for 35 
sec, annealing at 67 °C for 30 sec and elongation at 68 °C 
for 10 min during the first 5 cycles, 11 min during the 
next 10 cycles, 12 min during the following 10 cycles and 
13 min during the last 10 cycles. To amplify the complete 
ORF of HCV by long RT-PCR we used the sense primers H1 or 
A1 deduced from the 5' UTR and the antisense primer H9417R 
deduced from the variable region of the 3' UTR (Table 1, 
Fig. 1). 
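
The stepped elongation program above can be tabulated as follows. This sketch merely restates the hold times given in the text; it ignores ramp times between temperatures, which depend on the instrument, and the constant and function names are illustrative.

```python
DENATURE_S = 35  # denaturation hold per cycle, seconds
ANNEAL_S = 30    # annealing hold per cycle, seconds

# (number of cycles, elongation minutes per cycle), in program order.
ELONGATION_BLOCKS = [(5, 10), (10, 11), (10, 12), (10, 13)]


def elongation_schedule() -> list:
    """Elongation time in minutes for each cycle of the program."""
    schedule = []
    for n_cycles, minutes in ELONGATION_BLOCKS:
        schedule.extend([minutes] * n_cycles)
    return schedule


def total_hold_minutes() -> float:
    """Sum of all programmed hold times, excluding ramping."""
    schedule = elongation_schedule()
    short_steps = len(schedule) * (DENATURE_S + ANNEAL_S) / 60.0
    return sum(schedule) + short_steps


print(len(elongation_schedule()), "cycles")
print(sum(elongation_schedule()), "min of elongation")
print(round(total_hold_minutes(), 1), "min of programmed holds")
```

The four blocks add up to the 35 cycles stated in the text, and elongation accounts for most of the run time.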

Construction of Full-Length H77 cDNA Clones 

The long PCR products amplified with H1 and 
H9417R primers were cloned directly into pGEM-9zf(-) after 
digestion with Not I and Xba I (from Promega) (as per 
Fig. 1). Two clones were obtained with inserts of the 
expected size, pH21I and pH50I. Next, the chosen 3' UTR 
was cloned into both pH21I and pH50I after digestion with 
Afl II and Xba I (New England Biolabs). DH5α competent 
cells (GIBCO/BRL) were transformed and selected with LB 
agar plates containing 100 µg/ml ampicillin (from SIGMA). 
Then the selected colonies were cultured in LB liquid 
containing ampicillin at 30 °C for ~18-20 hrs 
(transformants containing full-length or near full-length 
cDNA of H77 produced a very low yield of plasmid when 
cultured at 37 °C or for more than 24 hrs). After small 
scale preparation (Wizard Plus Minipreps DNA Purification 
Systems, Promega) each plasmid was retransformed to select 
a single clone, and large scale preparation of plasmid DNA 
was performed with a QIAGEN plasmid Maxi kit. 

Cloning of Long RT-PCR Products Into a Cassette Vector 

To improve the efficiency of cloning, a vector 
with consensus 5' and 3' termini of HCV strain H77 was 
constructed (Fig. 1). This cassette vector (pCV) was 
obtained by cutting out the BamHI fragment (nts 1358 - 
7530 of the H77 genome) from pH50, followed by religation. 
Next, the long PCR products of H77 amplified with H1 and 
H9417R or A1 and H9417R primers were purified (Geneclean 
spin kit; BIO 101) and cloned into pCV after digestion 
with Age I and Afl II (New England Biolabs) or with Pin AI 
(isoschizomer of Age I) and Bfr I (isoschizomer of Afl II) 
(Boehringer Mannheim). Large scale preparations of the 
plasmids containing full-length cDNA of H77 were performed 
as described above. 

Construction of H77 Consensus Chimeric cDNA Clone 

A full-length cDNA clone of H77 with an ORF 
encoding the consensus amino acid sequence was constructed 
by making a chimera from four of the cDNA clones obtained 
above. This consensus chimera, pCV-H77C, was constructed 
in two ligation steps by using standard molecular 
procedures and convenient cleavage sites and involved 
first a two piece ligation and then a three piece 
ligation. Large scale preparation of pCV-H77C was 
performed as already described. 

In Vitro Transcription 

Plasmids containing the full-length HCV cDNA 
were linearized with Xba I (from Promega), and purified by 
phenol/chloroform extraction and ethanol precipitation. A 
100 µl reaction mixture containing 10 µg of linearized 
plasmid DNA, 1 x transcription buffer, 1 mM ATP, CTP, GTP 
and UTP, 10 mM DTT, 4% (v/v) RNasin (20-40 units/µl) and 2 
µl of T7 RNA polymerase (Promega) was incubated at 37 °C 
for 2 hrs. Five µl of the reaction mixture was analyzed 
by agarose gel electrophoresis followed by ethidium 
bromide staining. The transcription reaction mixture was 
diluted with 400 µl of ice-cold phosphate-buffered saline 
without calcium or magnesium, immediately frozen on dry 
ice and stored at -80 °C. The final nucleic acid mixture 
was injected into chimpanzees within 24 hrs. 

Intrahepatic Transfection of Chimpanzees 

Laparotomy was performed and aliquots from two 
transcription reactions were injected into 6 sites of the 
exposed liver (Emerson et al. (1992)). Serum samples were 
collected weekly from chimpanzees and monitored for liver 
enzyme levels and anti-HCV antibodies. Weekly samples of 
100 µl of serum were tested for HCV RNA in a highly 
sensitive nested RT-PCR assay with AmpliTaq Gold (Perkin 
Elmer) (Yanagi et al. (1996); Bukh et al. (1992)). The 
genome titer of HCV was estimated by testing 10-fold 
serial dilutions of the extracted RNA in the RT-PCR assay 
(Yanagi et al. (1996)). The two chimpanzees used in this 
study were maintained under conditions that met all 
requirements for their use in an approved facility. 

The consensus sequence of the complete ORF from 
HCV genomes recovered at week 2 post inoculation (p.i.) 
was determined by direct sequencing of PCR products 
obtained in long RT-PCR with primers A1 and H9417R 
followed by nested PCR of 10 overlapping fragments. The 
consensus sequence of the variable region of the 3' UTR 
was determined by direct sequencing of an amplicon 
obtained in nested RT-PCR as described above. Finally, we 
amplified selected regions independently by nested RT-PCR 
with AmpliTaq Gold. 

Sequence Analysis 

Both strands of DNA from PCR products, as well 
as plasmids, were sequenced with the ABI PRISM Dye 
Termination Cycle Sequencing Ready Reaction Kit using Taq 
DNA polymerase (Perkin Elmer) and about 100 specific sense 
and antisense sequence primers. 

The consensus sequence of HCV strain H77 was 
determined in two different ways. In one approach, 
overlapping PCR products amplified in nested RT-PCR from 
the H77 plasma sample were directly sequenced. 
The sequence analyzed (nucleotides (nts) 35-9417) included 
the entire genome except the very 5' and 3' termini. In 
the second approach, the consensus sequence of nts 157- 
9384 was deduced from the sequences of 18 full-length cDNA 
clones. 
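
The second approach, deducing a consensus from multiple clones, amounts to a per-position majority vote over aligned sequences. The following is a minimal sketch under stated assumptions: the clone sequences are short invented strings, already aligned and of equal length, and the function names are hypothetical. It is not the procedure actually used by the applicants, which is not described at this level of detail.

```python
from collections import Counter


def consensus(aligned: list) -> str:
    """Most frequent base at each position of equal-length, aligned sequences."""
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    # Ties are resolved in favour of the base seen first in the column.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def variable_positions(aligned: list) -> list:
    """1-based positions at which the clones do not all agree."""
    return [i + 1 for i, col in enumerate(zip(*aligned)) if len(set(col)) > 1]


# Invented toy clones, for illustration only.
clones = ["ACGTAC", "ACGTAC", "ACTTAC", "ACGTCC", "ACGTAC"]
print(consensus(clones))
print(variable_positions(clones))
```

With real data, positions where the vote is close would need separate inspection, since a simple majority hides the kind of near-even split reported later for position 790.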

EXAMPLE 1 

Variability in the sequence of the 3' UTR of HCV strain 
H77 

The heterogeneity of the 3' UTR was analyzed by 
cloning and sequencing of DNA amplicons obtained in nested 
RT-PCR. 19 clones containing sequences of the entire 
variable region, the poly U-UC region and the adjacent 19 
nt of the conserved region, and 65 clones containing 
sequences of the entire poly U-UC region and the first 63 
nts of the conserved region were analyzed. This analysis 
confirmed that the variable region consisted of 43 nts, 
including two conserved termination codons (Han et al. 
(1992)). The sequence of the variable region was highly 
conserved within H77 since only 3 point mutations were 
found among the 19 clones analyzed. A poly U-UC region 
was present in all 84 clones analyzed. However, its 
length varied from 71-141 nts. The length of the poly U 
region was 9-103 nts, and that of the poly UC region was 
35-85 nts. The number of C residues increased towards the 
3' end of the poly UC region but the sequence of this 
region is not conserved. The first 63 nts of the 
conserved region were highly conserved among the clones 
analyzed, with a total of only 14 point mutations. To 
confirm the validity of the analysis, the 3' UTR was 
reamplified directly from a full-length cDNA clone of HCV 
(see below) by the nested PCR procedure with the primers 
in the variable region and at the very 3' end of the HCV 
genome, and the PCR product was cloned. Eight clones had 1-7 
nt deletions in the poly U region. Furthermore, although 
the C residues of the poly UC region were maintained, the 
spacing of these varied because of 1-2 nt deletions of U 
residues. These deletions must be artifacts introduced by 
PCR and such mistakes may have contributed to the 
heterogeneity originally observed in this region. 
However, the conserved region of the 3' UTR was amplified 
correctly, suggesting that the deletions were due to 
difficulties in transcribing a highly repetitive sequence. 

One of the 3' UTR clones was selected for 
engineering of full-length cDNA clones of H77. This clone 
had the consensus variable sequence except for a single 
point mutation introduced to create an Afl II cleavage 
site, a poly U-UC stretch of 81 nts with the most commonly 
observed UC pattern and the consensus sequence of the 
complete conserved region of 101 nts, including the distal 
38 nts which originated from the antisense primer used in 
the amplification. After linearization with Xba I, the 
DNA template of this clone had the authentic 3' end. 

EXAMPLE 2 

The Entire Open Reading Frame of H77 Amplified in One 
Round of Long RT-PCR 

It had been previously demonstrated that a 9.25 
kb fragment of the HCV genome from the 5' UTR to the 3' 
end of NS5B could be amplified from 10* GE (genome 
equivalents) of H77 by a single round of long RT-PCR 
(Tellier et al. (1996a)). In the current study, by 
optimizing primers and cycling conditions, the entire ORF 
of H77 was amplified in a single round of long RT-PCR with 
primers from the 5' UTR and the variable region of the 3' 
UTR. In fact, 9.4 kb of the H77 genome (H product: from 
the very 5' end to the variable region of the 3' UTR) 
could be amplified from 10^ GE or 9.3 kb (A product: from 
within the 5' UTR to the variable region of the 3' UTR) 
from 10"* GE or 10^ GE, in a single round of long RT-PCR 
(Fig. 2). The PCR products amplified from 10^ GE of H77 
were used for engineering full-length cDNA clones (see 
below). 

EXAMPLE 3 

Construction of Multiple Full-Length 
cDNA Clones of H77 in a Single Step by 
Cloning of Long RT-PCR Amplicons Directly 
into a Cassette Vector with Fixed 5' and 3' Termini 

Direct cloning of the long PCR products (H) , 
which contained a 5' T7 promoter, the authentic 5' end, 
the entire ORF of H77 and a short region of the 3' UTR, 
into pGEM-9zf(-) vector by Not I and Xba I digestion was 
first attempted. However, among the 70 clones examined 
all but two had inserts that were shorter than predicted. 
Sequence analysis identified a second Not I site in the 
majority of clones, which resulted in deletion of the nts 
past position 9221. Only two clones (pH21I and pH50I) were 
missing the second Not I site and had the expected 5' and 
3' sequences of the PCR product. Therefore, full-length 
cDNA clones (pH21 and pH50) were constructed by inserting 
the chosen 3' UTR into pH21I and pH50I, respectively. 
Sequence analysis revealed that clone pH21 had a complete 
full-length sequence of H77; this clone was tested for 
infectivity. The second clone, pH50, had one nt deletion 
in the ORF at position 6365; this clone was used to make a 
cassette vector. 

The complete ORF was amplified by constructing a 
cassette vector with fixed 5' and 3' termini as an 
intermediate of the full-length cDNA clones. This vector 
(pCV) was constructed by digestion of clone pH50 with 
BamHI, followed by religation, to give a shortened plasmid 
readily distinguished from plasmids containing the full- 
length insert. Attempts to clone long RT-PCR products (H) 
into pCV by Age I and Afl II yielded only 1 of 23 clones 
with an insert of the expected size. In order to increase 
the efficiency of cloning, we repeated the procedure but 
used Pin AI and Bfr I instead of the respective 
isoschizomers Age I and Afl II. By this protocol, 24 of 
31 H clones and 30 of 35 A clones had the full-length cDNA 
of H77 as evaluated by restriction enzyme digestion. A 
total of 16 clones, selected at random, were each 
retransformed, and individual plasmids were purified and 
completely sequenced. 

EXAMPLE 4 

Demonstration of Infectious Nature of 
Transcripts of a cDNA Clone Representing 
the Consensus Sequence of Strain H77 

A consensus chimera was constructed from 4 of 
the full-length cDNA clones with just 2 ligation steps. 
The final construct, pCV-H77C, had 11 nt differences from 
the consensus sequence of H77 in the ORF (Fig. 3). 
However, 10 of these nucleotide differences represented 
silent mutations. The chimeric clone differed from the 
consensus sequence at only one amino acid [L instead of F 
at position 790]. Among the 18 ORFs analyzed above, the F 
residue was found in 11 clones and the L residue in 7 
clones. However, the L residue was dominant in other 
isolates of genotype 1a, including a first passage of H77 
in a chimpanzee (Inchauspe et al. (1991)). 
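
The distinction drawn above between silent nucleotide differences and those that change an amino acid follows directly from the standard genetic code. The sketch below is illustrative only: the codon table is the standard one, but the example codons are hypothetical, since the actual codons at the positions discussed are not given in the text, and the function name is an assumption.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Return 'silent' if both codons encode the same residue, else 'X->Y'."""
    ref_aa = CODON_TABLE[ref_codon.upper().replace("U", "T")]
    alt_aa = CODON_TABLE[alt_codon.upper().replace("U", "T")]
    if ref_aa == alt_aa:
        return "silent"
    return f"{ref_aa}->{alt_aa}"


# Hypothetical codons chosen only to illustrate each outcome.
print(classify_substitution("TTT", "TTC"))  # third-position change, same residue
print(classify_substitution("TTC", "CTC"))  # first-position change, F to L
```

Applied to each of the 11 nucleotide differences in turn, a classification of this kind would separate the 10 silent changes from the single F to L change.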

To test the infectivity of the consensus 
chimeric clone of H77, intrahepatic transfection of a 
chimpanzee was performed. The pCV-H77C clone was 
linearized with Xba I and transcribed in vitro by T7 RNA 
polymerase (Fig. 2). The transcription mixture was next 
injected into 6 sites of the liver of chimpanzee 1530. 
The chimpanzee became infected with HCV as measured by 
detection of 10^ GE/ml of viral genome at week 1 p.i. 
Furthermore, the HCV titer increased to 10'' GE/ml at week 2 
p.i., and reached 10* GE/ml by week 8 p.i. The viremic 
pattern observed in the early phase of the infection with 
the recombinant virus was similar to that observed in 
chimpanzees inoculated intravenously with strain H77 or 
other strains of HCV (Shimizu (1990)). 

The sequence of the HCV genomes from the serum 
sample collected at week 2 p.i. was analyzed. The 
consensus sequence of nts 298-9375 of the recovered 
genomes was determined by direct sequencing of PCR 
products obtained in long RT-PCR followed by nested PCR of 
10 overlapping fragments. The identity to clone pCV-H77C 
sequence was 100%. The consensus sequence of nts 96-291, 
1328-1848, 3585-4106, 4763-5113 and 9322-9445 was 
determined from PCR products obtained in different nested 
RT-PCR assays. The identity of these sequences with 
pCV-H77C was also 100%. These latter regions contained 4 
mutations unique to the consensus chimera, including the 
artificial Afl II cleavage site in the 3' UTR. Therefore, 
RNA transcripts of this clone of HCV were infectious. 

The infectious nature of the consensus chimera 
indicates that the regions of the 5' and 3' UTRs 
incorporated into the cassette vector do not destroy 
viability. This makes it highly advantageous to use the 
cassette vector to construct infectious cDNA clones of 
other HCV strains when the consensus sequence for each ORF 
is inserted. 

In addition, two complete full-length clones 
(dubbed pH21 and pCV-H11) constructed were not infectious, 
as shown by intrahepatic injection of chimpanzees with the 
corresponding RNA transcripts. Thus, injection of the 
transcription mixture into 3 sites of the exposed liver 
resulted in no observable HCV replication and weekly serum 
samples were negative for HCV RNA at weeks 1 - 17 p.i. in 
a highly sensitive nested RT-PCR assay. The cDNA template 
injected along with the RNA transcripts was also not 
detected in this assay. 

Moreover, the chimpanzee remained negative for 
antibodies to HCV throughout the follow-up. Subsequent 
sequence analysis revealed that 7 of 16 additional clones 
were defective for polyprotein synthesis and all clones 
had multiple amino acid mutations compared with the 
consensus sequence of the parent strain. For example, 
clone pH21, which was not infectious, had 7 amino acid 
substitutions in the entire predicted polyprotein compared 
with the consensus sequence of H77 (Fig. 3). The most 
notable mutation was at position 1026, which changed L to 
Q, altering the cleavage site between NS2 and NS3 (Reed 
(1995)). Clone pCV-H11, also non-infectious, had 21 amino 
acid substitutions in the predicted polyprotein compared 
with the consensus sequence of H77 (Fig. 3). The amino 
acid mutation at position 564 eliminated a highly 
conserved C residue in the E2 protein (Okamoto (1992a)). 

EXAMPLE 4A 

The chimpanzee of Example 4, designated 1530, 
was monitored out to 32 weeks p.i. for serum enzyme levels 
(ALT) and the presence of anti-HCV antibodies, HCV RNA, 
and liver histopathology. The results are shown in Figure 
18B. 

A second chimp, designated 1494, was also 
transfected with RNA transcripts of the pCV-H77C clone and 
monitored out to 17 weeks p.i. for the presence of anti- 
HCV antibodies, HCV RNA and elevated serum enzyme levels. 
The results are shown in Figure 18A. 

MATERIALS AND METHODS 
for Examples 5-10 

Source Of HCV Genotype 1b 

An infectious plasma pool (second chimpanzee 
passage) containing strain HC-J4, genotype 1b, was 
prepared from acute phase plasma of a chimpanzee 
experimentally infected with serum containing HC-J4/91 
(Okamoto et al., 1992b). The HC-J4/91 sample was obtained 
from a first chimpanzee passage during the chronic phase 
of hepatitis C about 8 years after experimental infection. 
The consensus sequence of the entire genome, except for 
the very 3' end, was determined previously for HC-J4/91 
(Okamoto et al., 1992b). 

Preparation Of HCV RNA 

Viral RNA was extracted from 100 µl aliquots of 
the HC-J4 plasma pool with the TRIzol system (GIBCO BRL). 
The RNA pellets were each resuspended in 10 µl of 10 mM 
dithiothreitol (DTT) with 5% (vol/vol) RNasin (20-40 
units/µl) (Promega) and stored at -80 °C or immediately 
used for cDNA synthesis. 

Amplification And Cloning Of The 3' UTR 

A region spanning from NS5B to the conserved 
region of the 3' UTR was amplified in nested RT-PCR using 
the procedure of Yanagi et al. (1997). 

In brief, the RNA was denatured at 65 °C for 2 
minutes, and cDNA was synthesized at 42 °C for 1 hour with 
Superscript II reverse transcriptase (GIBCO BRL) and 
primer H3'X58R (Table 1) in a 20 µl reaction volume. The 
cDNA mixture was treated with RNase H and RNase T1 (GIBCO 
BRL) at 37 °C for 20 minutes. The first round of PCR was 
performed on 2 µl of the final cDNA mixture in a total 
volume of 50 µl with the Advantage cDNA polymerase mix 
(Clontech) and external primers H9261F (Table 1) and 
H3'X58R (Table 1). In the second round of PCR [internal 
primers H9282F (Table 1) and H3'X45R (Table 1)], 5 µl of 
the first round PCR mixture was added to 45 µl of the PCR 
reaction mixture. Each round of PCR (35 cycles) was 
performed in a DNA thermal cycler 480 (Perkin Elmer) and 
consisted of denaturation at 94 °C for 1 minute (1st cycle: 
1 minute 30 sec), annealing at 60 °C for 1 minute and 
elongation at 68 °C for 2 minutes. After purification with 
QIAquick PCR purification kit (QIAGEN), digestion with 
HindIII and XbaI (Promega), and phenol/chloroform 
extraction, the amplified products were cloned into 
pGEM-9zf(-) (Promega) (Yanagi et al., 1997). 

Amplification And Cloning Of The Entire ORF 

A region from within the 5' UTR to the variable 
region of the 3' UTR of strain HC-J4 was amplified by long 
RT-PCR (Fig. 1) (Yanagi et al., 1997). The cDNA was 
synthesized at 42 °C for 1 hour in a 20 µl reaction volume 
with Superscript II reverse transcriptase and primer 
J4-9405R (5'-GCCTATTGGCCTGGAGTGGTTAGCTC-3'), and treated with 
RNases as above. The cDNA mixture (2 µl) was amplified by 
long PCR with the Advantage cDNA polymerase mix and 
primers A1 (Table 1) (Bukh et al., 1992; Yanagi et al., 
1997) and J4-9398R 
(5'-AGGAraGCrTTAAGGCCTGGAGTGGTTAGCTCCCCGTTCA-3'). Primer 
J4-9398R contained extra bases (bold) and an artificial AflII 
cleavage site (underlined). A single PCR round was 
performed in a Robocycler thermal cycler (Stratagene), and 
consisted of denaturation at 99 °C for 35 seconds, 
annealing at 67 °C for 30 seconds and elongation at 68 °C 
for 10 minutes during the first 5 cycles, 11 minutes 
during the next 10 cycles, 12 minutes during the following 
10 cycles and 13 minutes during the last 10 cycles. 

After digesting the long PCR products obtained 
from strain HC-J4 with PinAI (isoschizomer of AgeI) and 
BfrI (isoschizomer of AflII) (Boehringer Mannheim), 
attempts were made to clone them directly into a cassette 
vector (pCV), which contained the 5' and 3' termini of 
strain H77 (Figure 1) but no full-length clones were 
obtained. Accordingly, to improve the efficiency of 
cloning, the PCR product was further digested with BglII 
(Boehringer Mannheim) and the two resultant genome 
fragments [L fragment: PinAI/BglII, nts 156 - 8935; S 
fragment: BglII/BfrI, nts 8936 - 9398] were separately 
cloned into pCV (Figure 6). 

DH5α competent cells (GIBCO BRL) were 
transformed and selected on LB agar plates containing 100 
µg/ml ampicillin (SIGMA) and amplified in LB liquid 
cultures at 30 °C for 18-20 hours. 

Sequence analysis of 9 plasmids containing the S 
fragment (miniprep samples) and 9 plasmids containing the 
L fragment (maxiprep samples) was performed as described 
previously (Yanagi et al., 1997). Three L fragments, each 
encoding a distinct polypeptide, were cloned into pCV-J4S9 
(which contained an S fragment encoding the consensus 
amino acid sequence of HC-J4) to construct three chimeric 
full-length HCV cDNAs (pCV-J4L2S, pCV-J4L4S and pCV-J4L6S) 
(Fig. 6). Large scale preparation of each clone was 
performed as described previously with a QIAGEN plasmid 
Maxi kit (Yanagi et al., 1997) and the authenticity of 
each clone was confirmed by sequence analysis. 

Sequence Analysis 

Both strands of DNA were sequenced with the ABI 
PRISM Dye Termination Cycle Sequencing Ready Reaction Kit 
using Taq DNA polymerase (Perkin Elmer) and about 90 
specific sense and antisense primers. Analyses of genomic 
sequences, including multiple sequence alignments and tree 
analyses, were performed with GeneWorks (Oxford Molecular 
Group) (Bukh et al., 1995). 

The consensus sequence of strain HC-J4 was 
determined by direct sequencing of PCR products (nts 11 - 
9412) and by sequence analysis of multiple cloned L and S 
fragments (nts 156 - 9371). The consensus sequence of the 
3' UTR (3' variable region, polypyrimidine tract and the 
first 16 nucleotides of the conserved region) was 
determined by analysis of 24 cDNA clones. 

Intrahepatic Transfection Of A Chimpanzee 
With Transcribed RNA 

Two in vitro transcription reactions were 
performed with each of the three full-length clones. In 
each reaction 10 µg of plasmid DNA linearized with Xba I 
(Promega) was transcribed in a 100 µl reaction volume with 
T7 RNA polymerase (Promega) at 37 °C for 2 hours as 
described previously (Yanagi et al., 1997). Five µl of 
the final reaction mixture was analyzed by agarose gel 
electrophoresis and ethidium bromide staining (Fig. 5). 
Each transcription mixture was diluted with 400 µl of 
ice-cold phosphate-buffered saline without calcium or 
magnesium and then the two aliquots from the same cDNA 
clone were combined, immediately frozen on dry ice and 
stored at -80 °C. Within 24 hours after freezing the 
transcription mixtures were injected into the chimpanzee 
by percutaneous intrahepatic injection that was guided by 
ultrasound. Each inoculum was individually injected (5-6 
sites) into a separate area of the liver to prevent 
complementation or recombination. The chimpanzee was 
maintained under conditions that met all requirements for 
its use in an approved facility. 

Serum samples were collected weekly from the 
chimpanzee and monitored for liver enzyme levels and 
anti-HCV antibodies. Weekly samples of 100 µl of serum 
were tested for HCV RNA in a sensitive nested RT-PCR assay 
(Bukh et al., 1992; Yanagi et al., 1996) with AmpliTaq 
Gold DNA polymerase. The genome equivalent (GE) titer of 
HCV was determined by testing 10-fold serial dilutions of 
the extracted RNA in the RT-PCR assay (Yanagi et al., 
1996) with 1 GE defined as the number of HCV genomes 
present in the highest dilution which was positive in the 
RT-nested PCR assay. 
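
The end-point definition of the GE titer given above can be illustrated with a short calculation. This is a sketch under stated assumptions, not the applicants' procedure: the parameter `serum_equiv_ml` (the volume of serum represented by the undiluted RNA in one reaction) is a hypothetical input that the text does not specify, and the example results are invented.

```python
def endpoint_titer(results: list, serum_equiv_ml: float) -> float:
    """Estimate GE/ml from a 10-fold dilution series.

    results[k] is True when the 10^-k dilution was positive in the assay.
    The highest positive dilution is taken to contain 1 GE, as defined in the text.
    """
    if serum_equiv_ml <= 0:
        raise ValueError("serum equivalent must be positive")
    positive = [k for k, is_positive in enumerate(results) if is_positive]
    if not positive:
        return 0.0  # no dilution positive: below the detection limit
    return (10 ** max(positive)) / serum_equiv_ml


# Invented example: undiluted, 10^-1 and 10^-2 positive; 10^-3 and 10^-4 negative.
# Assumes each undiluted reaction represents 0.01 ml (10 µl) of serum.
print(endpoint_titer([True, True, True, False, False], serum_equiv_ml=0.01))
```

Because the series moves in 10-fold steps, a titer estimated this way is resolved only to the nearest order of magnitude.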

To identify which of the three clones was 
infectious in vivo, the NS3 region (nts 3659 - 4110) from 
the chimpanzee serum was amplified in a highly sensitive 
and specific nested RT-PCR assay with AmpliTaq Gold DNA 
polymerase and the PCR products were cloned with a TA 
cloning kit (Invitrogen). In addition, the consensus 
sequence of the nearly complete genome (nts 11 - 9441) was 
determined by direct sequencing of overlapping PCR 
products. 

EXAMPLE 5 

Sequence Analysis Of Infectious Plasma Pool 
Of Strain HC-J4 Used As The Cloning Source 

As an infectious cDNA clone of a genotype 1a 
strain of HCV had been obtained only after the ORF was 
engineered to encode the consensus polypeptide (Kolykhalov 
et al., 1997; Yanagi et al., 1997), a detailed sequence 
analysis of the cloning source was performed to determine 
the consensus sequence prior to constructing an infectious 
cDNA clone of a 1b genotype. 

A plasma pool of strain HC-J4 was prepared from 
acute phase plasmapheresis units collected from a 
chimpanzee experimentally infected with HC-J4/91 (Okamoto 
et al., 1992b). This HCV pool had a PCR titer of 10* - 10^ 
GE/ml and an infectivity titer of approximately 10^ 
chimpanzee infectious doses per ml. 

The heterogeneity of the 3' UTR of strain HC-J4 
was determined by analyzing 24 clones of nested RT-PCR 
product. The consensus sequence was identical to that 
previously published for HC-J4/91 (Okamoto et al., 1992b), 
except at position 9407 (see below). The variable region 
consisted of 41 nucleotides (nts 9372 - 9412), including 
two in-frame termination codons. Furthermore, its 
sequence was highly conserved except at positions 9399 (19 
A and 5 T clones) and 9407 (17 T and 7 A clones) . The 
poly U-UC region varied slightly in composition and 
greatly in length (31-162 nucleotides) . In the conserved 
region, the first 16 nucleotides of 22 clones were 
identical to those previously published for other genotype 
1 strains, whereas two clones each had a single point 
mutation. These data suggested that the structural 
organization at the 3' end of HC-J4 was similar to that of 
the infectious clone of a genotype la strain of Yanagi et 
al (1997) . 
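As an illustration of how a consensus base is called at a heterogeneous position, the following minimal sketch takes the majority base among the sequenced clones. The function name is hypothetical; only the clone counts at positions 9399 and 9407 are taken from the text above.

```python
from collections import Counter


def consensus_base(column):
    """Return (majority base, its count, number of clones) for one position."""
    counts = Counter(column)
    base, n = counts.most_common(1)[0]
    return base, n, len(column)


# Clone counts reported for the two heterogeneous variable-region positions.
column_9399 = "A" * 19 + "T" * 5   # 19 A clones and 5 T clones
column_9407 = "T" * 17 + "A" * 7   # 17 T clones and 7 A clones

call_9399 = consensus_base(column_9399)
call_9407 = consensus_base(column_9407)
```

A simple majority rule is assumed here; it does not handle ties, which do not occur at these two positions.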

Next, the entire ORF of HC-J4 was amplified in a 
single round of long RT-PCR (Figure 5). The original plan 
was to clone the resulting PCR products into the PinAI and 
BglII site of an HCV cassette vector (pCV), which had fixed 
5' and 3' termini of genotype 1a (Yanagi et al., 1997), but 
since full-length clones were not obtained, two genome 
fragments (L and S) derived from the long RT-PCR products 
(Figure 6) were separately subcloned into pCV. 

To determine the consensus sequence of the ORF, 
the sequence of 9 clones each of the L fragment (pCV-J4L) 
and of the S fragment (pCV-J4S) was determined, and 
quasispecies were found at 275 nucleotide (3.05%) and 78 
amino acid (2.59%) positions, scattered throughout the 
9030 nts (3010 aa) of the ORF (Figure 7). Of the 161 
nucleotide substitutions unique to a single clone, 71% 
were at the third position of the codon and 72% were 
silent. 
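The classification of substitutions by codon position and by whether they are silent can be sketched as follows. This is an illustrative sketch only: the function names and the 9-nucleotide toy ORF are hypothetical, and the standard genetic code is assumed. The helper `percent` reproduces the proportions quoted above from the stated counts.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(ref_orf, pos, new_base):
    """Classify a single-base substitution in an in-frame ORF.

    pos is a 0-based index into ref_orf. Returns (codon_position, is_silent),
    where codon_position is 1, 2 or 3.
    """
    offset = pos % 3
    start = pos - offset
    codon = ref_orf[start:start + 3]
    mutant = codon[:offset] + new_base + codon[offset + 1:]
    return offset + 1, CODON_TABLE[codon] == CODON_TABLE[mutant]


def percent(part, whole):
    """Percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Hypothetical toy ORF (Met-Ala-Trp), for illustration only.
toy_orf = "ATGGCTTGG"
third_position_change = classify_substitution(toy_orf, 5, "C")  # GCT -> GCC
first_position_change = classify_substitution(toy_orf, 3, "A")  # GCT -> ACT
```

Applied to every substitution unique to a single clone, a tally of the two returned values would give the third-position and silent fractions of the kind reported above.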

Each of the nine L clones represented the near 
complete ORF of an individual genome. The differences 
among the L clones were 0.30 - 1.53% at the nucleotide and 
0.31 - 1.47% at the amino acid level (Figure 8). Two 
clones, L1 and L7, had a defective ORF due to a single 
nucleotide deletion and a single nucleotide insertion, 
respectively. Even though the HC-J4 plasma pool was 
obtained in the early acute phase, it appeared to contain 
at least three viral species (Figure 9). Species A 
contained the L1, L2, L6, L8 and L9 clones, species B the 
L3, L7 and L10 clones and species C the L4 clone. 
Although each species A clone was unique, all A clones 
differed from all B clones at the same 20 amino acid sites 
and, at these positions, species C had the species A 
sequence at 14 positions and the species B sequence at 6 
positions (Figure 7). 
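The range of pairwise differences among clones can be computed as in the following minimal sketch. The function names and the three short sequences are hypothetical placeholders rather than data from the application, and the sketch assumes the sequences are already aligned and of equal length.

```python
from itertools import combinations


def percent_difference(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


def pairwise_range(clones):
    """Return (minimum, maximum) percent difference over all clone pairs."""
    values = [
        percent_difference(clones[x], clones[y])
        for x, y in combinations(sorted(clones), 2)
    ]
    return min(values), max(values)


# Hypothetical 10-nucleotide sequences, for illustration only.
toy_clones = {
    "L2": "ACGTACGTAC",
    "L3": "ACGTACGTAA",
    "L4": "ACGAACGTAA",
}
lowest, highest = pairwise_range(toy_clones)
```

The same comparison applies to deduced amino acid sequences; insertions and deletions would require an alignment step that is outside this sketch.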

Okamoto and coworkers (Okamoto et al., 1992b) 
previously determined the nearly complete genome consensus 
sequence of strain HC-J4 in acute phase serum of the first 
chimpanzee passage (HC-J4/83) as well as in chronic phase 
serum collected 8.2 years later (HC-J4/91). In addition, 
they determined the sequence of amino acids 379 to 413 
(including HVR1) and amino acids 468 to 486 (including 
HVR2) of multiple individual clones (Okamoto et al., 
1992b). 

It was found by the present inventors that the 
sequences of individual genomes in the plasma pool 
collected from a chimpanzee inoculated with HC-J4/91 were 
all more closely related to HC-J4/91 than to HC-J4/83 
(Figures 8, 9) and contained HVR amino acid sequences 
closely related to three of the four viral species 
previously found in HC-J4/91 (Figure 10). 

Thus, the data presented herein demonstrate the 
occurrence of the simultaneous transmission of multiple 
species to a single chimpanzee and clearly illustrate the 
difficulties in accurately determining the evolution of 
HCV over time, since multiple species with significant 
changes throughout the HCV genome can be present from the 
onset of the infection. Accordingly, infection of 
chimpanzees with monoclonal viruses derived from the 
infectious clones described herein will make it possible 
to perform more detailed studies of the evolution of HCV 
in vivo and its importance for viral persistence and 
pathogenesis. 

EXAMPLE 6 

Determination Of The Consensus 
Sequence Of HC-J4 In The Plasma Pool 

The consensus sequence of nucleotides 156-9371 
of HC-J4 was determined by two approaches. In one 
approach, the consensus sequence was deduced from 9 clones 
of the long RT-PCR product. In the other approach, the 
long RT-PCR product was reamplified by PCR as overlapping 
fragments which were sequenced directly. The two 
"consensus" sequences differed at 31 (0.34%) of 9216 
nucleotide positions and at 11 (0.37%) of 3010 deduced 
amino acid positions (Figure 7). At all of these 
positions a major quasispecies of strain HC-J4 was found 
in the plasma pool. At 9 additional amino acid positions 
the cloned sequences displayed heterogeneity and the 
direct sequence was ambiguous (Figure 7). Finally, it 
should be noted that there were multiple amino acid 
positions at which the consensus sequence obtained by 
direct sequencing was identical to that obtained by 
cloning and sequencing even though a major quasispecies 
was detected (Figure 7). 

For positions at which the two "consensus" 
sequences of HC-J4 differed, both amino acids were 
included in a composite consensus sequence (Figure 7). 
However, even with this allowance, none of the 9 L clones 
analyzed (aa 1 - 2864) had the composite consensus 
sequence: two clones did not encode the complete 
polypeptide and the remaining 7 clones differed from the 
consensus sequence by 3 - 13 amino acids (Figure 7). 
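The idea of a composite consensus, in which either residue is accepted wherever the two consensus sequences disagree, can be sketched as follows. The function names and the short peptide strings are hypothetical and serve only to illustrate the comparison; they are not sequences from the application.

```python
def composite_consensus(consensus_a, consensus_b):
    """Per position, the set of residues allowed by either consensus."""
    if len(consensus_a) != len(consensus_b):
        raise ValueError("consensus sequences must have the same length")
    return [{a, b} for a, b in zip(consensus_a, consensus_b)]


def differences_from_composite(clone, composite):
    """1-based positions where the clone has a residue the composite does not allow."""
    return [
        i
        for i, (residue, allowed) in enumerate(zip(clone, composite), start=1)
        if residue not in allowed
    ]


# Hypothetical 10-residue sequences, for illustration only.
from_cloning = "MSTNPKPQRK"
from_direct = "MSTNPRPQRK"      # differs from the cloned consensus at position 6
composite = composite_consensus(from_cloning, from_direct)

clone_matching = "MSTNPRPQRK"   # uses the alternative residue at position 6
clone_divergent = "MSTNAKPQRQ"  # differs at positions 5 and 10
```

Counting the returned positions for each clone gives the number of amino acid differences from the composite consensus of the kind reported above.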

EXAMPLE 7 

Construction Of Chimeric Full-Length cDNA 
Clones Containing The Entire ORF Of HC-J4 

The cassette vector used to clone strain H77 was 
used to construct an infectious cDNA clone containing the 
ORF of a second subtype. 

In brief, three full-length cDNA clones were 
constructed by cloning different L fragments into the 
PinAI/BglII site of pCV-J4S9, the cassette vector for 
genotype 1a (Figure 6), which also contained an S fragment 
encoding the consensus amino acid sequence of HC-J4. 
Therefore, although the ORF was from strain HC-J4, most of 
the 5' and 3' terminal sequences originated from strain 
H77. As a result, the 5' and 3' UTR were chimeras of 
genotypes 1a and 1b (Figure 11). 

The first 155 nucleotides of the 5' UTR were 
from strain H77 (genotype 1a), and differed from the 
authentic sequence of HC-J4 (genotype 1b) at nucleotides 
11, 12, 13, 34 and 35. In two clones (pCV-J4L2S, pCV-J4L6S) 
the rest of the 5' UTR had the consensus sequence 
of HC-J4, whereas the third clone (pCV-J4L4S) had a single 
nucleotide insertion at position 207. In all 3 clones the 
first 27 nucleotides of the 3' variable region of the 3' 
UTR were identical with the consensus sequence of HC-J4. 
The remaining 15 nucleotides of the variable region, the 
poly U-UC region and the 3' conserved region of the 3' UTR 
had the same sequence as an infectious clone of strain H77 
(Figure 11). 

None of the three full-length clones of HC-J4 
had the ORF composite consensus sequence (Figures 7, 12). 
The pCV-J4L6S clone had only three amino acid changes: Q 
for R at position 231 (E1), V for A at position 937 (NS2) 
and T for S at position 1215 (NS3). The pCV-J4L4S clone 
had 7 amino acid changes, including a change at position 
450 (E2) that eliminated a highly conserved N-linked 
glycosylation site (Okamoto et al., 1992a). Finally, the 
pCV-J4L2S clone had 9 amino acid changes compared with the 
consensus sequence of HC-J4. A change at position 304 
(E1) mutated a highly conserved cysteine residue (Bukh et 
al., 1993; Okamoto et al., 1992a). 

EXAMPLE 8 

Transfection Of A Chimpanzee By In 
Vitro Transcripts Of A Chimeric cDNA 

The infectivity of the three chimeric HCV clones 
was determined by ultrasound-guided percutaneous 
intrahepatic injection into the liver of a chimpanzee of 
the same amount of cDNA and transcription mixture for each 
of the clones (Figure 5). This procedure is less 
invasive than the laparotomy procedure utilized 
by Kolykhalov et al. (1997) and Yanagi et al. (1997) and 
should facilitate in vivo studies of cDNA clones of HCV in 
chimpanzees since percutaneous procedures, unlike 
laparotomy, can be performed repeatedly. 

As shown in Figure 13 , the chimpanzee became 
infected with HCV as measured by increasing titers of 10^ 
GE/ml at week 1 p.i., 10^ GE/ml at week 2 p.i. and 10'* - 
10^ GE/ml during weeks 3 to 10 p.i. 

The viremic pattern found in the early phase of 
the infection was similar to that observed for the 
recombinant H77 virus in chimpanzees (Bukh et al., 
unpublished data; Kolykhalov et al., 1997; Yanagi et al., 
1997). The chimpanzee transfected in the present study 
was chronically infected with hepatitis G virus (HGV/GBV-C) 
(Bukh et al., 1998) and had a titer of 10^ GE/ml at the 
time of HCV transfection. Although HGV/GBV-C was 
originally believed to be a hepatitis virus, it does not 
cause hepatitis in chimpanzees (Bukh et al., 1998) and may 
not replicate in the liver (Laskus et al., 1997). The 
present study demonstrated that an ongoing infection of 
HGV/GBV-C did not prevent acute HCV infection in the 
chimpanzee model. 

To identify which of the three full-length 
HC-J4 clones were infectious, the NS3 region (nts. 
3659 - 4110) of HCV genomes amplified by RT-PCR from serum 
samples taken from the infected chimpanzee during weeks 2 
and 4 post-infection (p.i.) was cloned and sequenced. As 
the PCR primers were a complete match with each of the 
original three clones, this assay should not have 
preferentially amplified one virus over another. Sequence 
analysis of 26 and 24 clones obtained at weeks 2 and 4 
p.i., respectively, demonstrated that all originated from 
the transcripts of pCV-J4L6S. 
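The assignment of each sequenced NS3-region clone to one of the three transfected constructs can be illustrated with the following minimal sketch. It assigns a recovered sequence to the input construct with the fewest mismatches and reports no assignment on a tie. The function names and the 8-nucleotide sequences are hypothetical placeholders, not the actual NS3 sequences; only the construct names are taken from the text.

```python
from collections import Counter


def mismatches(seq_a, seq_b):
    """Number of differing positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def closest_construct(read, constructs):
    """Name of the unique construct closest to the read, or None on a tie."""
    distances = {name: mismatches(read, seq) for name, seq in constructs.items()}
    best = min(distances.values())
    winners = [name for name, d in distances.items() if d == best]
    return winners[0] if len(winners) == 1 else None


def tally_origins(reads, constructs):
    """Count how many recovered sequences are assigned to each construct."""
    return Counter(closest_construct(read, constructs) for read in reads)


# Hypothetical placeholder sequences, for illustration only.
toy_constructs = {
    "pCV-J4L2S": "ACGTTGCA",
    "pCV-J4L4S": "ACGATGCA",
    "pCV-J4L6S": "ACGTTGGA",
}
toy_reads = ["ACGTTGGA", "ACGTTGGA", "ACGTTGGT"]  # last read carries one extra change
origin_counts = tally_origins(toy_reads, toy_constructs)
```

Allowing a nearest match rather than requiring identity is a design choice of this sketch, intended to tolerate isolated changes introduced during amplification.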

Moreover, the consensus sequence of PCR products 
of the nearly complete genome (nts. 11-9441), amplified 
from serum obtained during week 2 p.i., was identical to 
the sequence of pCV-J4L6S and there was no evidence of 
quasispecies. Thus, RNA transcripts of pCV-J4L6S, but not 
of pCV-J4L2S or pCV-J4L4S, were infectious in vivo. The 
data in Figure 13 are therefore the product of the 
transfection of RNA transcripts of pCV-J4L6S. 

In addition, the chimeric sequences of genotypes 
1a and 1b in the UTRs were maintained in the infected 
chimpanzee. The consensus sequence of nucleotides 11 - 
341 of the 5' UTR and the variable region of the 3' UTR, 
amplified from serum obtained during weeks 2 and 4 p.i., 
had the expected chimeric sequence of genotypes 1a and 1b 
(Fig. 11). Also, three of four clones of the 3' UTR 
obtained at week 2 p.i. had the chimeric sequence of the 
variable region, whereas a single substitution was noted 
in the fourth clone. However, in all four clones the poly 
U region was longer (2-12 nts) than expected. Also, extra 
C and G residues were observed in this region. For the 
most part, the number of C residues in the poly UC region 
was maintained in all clones although the spacing varied. 
As shown previously, variations in the number of U 
residues can reflect artifacts introduced during PCR 
amplification (Yanagi et al., 1997). The sequence of the 
first 19 nucleotides of the conserved region was 
maintained in all four clones. Thus, with the exception 
of the poly U-UC region, the genomic sequences recovered 
from the infected chimpanzee were exactly those of the 
chimeric infectious clone pCV-J4L6S. 

The results presented in Figure 13 therefore 
demonstrate that HCV polypeptide sequences other than the 
consensus sequence can be infectious and that a chimeric 
genome containing portions of the H77 termini could 
produce an infectious virus. In addition, these results 
showed for the first time that it is possible to make 
infectious viruses containing 5' and 3' terminal sequences 
specific for two different subtypes of the same major 
genotype of HCV. 

EXAMPLE 9 

Construction Of A Chimeric 
1a/1b Infectious Clone 

A chimeric 1a/1b infectious clone in which the 
structural region of the genotype 1b infectious clone is 
inserted into the 1a clone of Yanagi et al. (1997) is 
constructed by following the protocol shown in Figure 15. 
The resultant chimera contains nucleotides 156-2763 of the 
1b clone described herein inserted into the 1a clone of 
Figures 4A-4F. The sequences of the primers shown in 
Figure 15 which are used in constructing this chimeric 
clone, designated pH77CV-J4, are presented below. 
1. H2751S (Cla I/Nde I) 
CGT CAT CGA TCC TCA GCG GGC ATA TGC ACT GGA CAC GGA 

2. H2870R 
CAT GCA CCA GCT GAT ATA GCG CTT GTA ATA TG 

3. H7851S 
TCC GTA GAG GAA GCT TGC AGC CTG ACG CCC 

4. H9173R (P-M) 
GTA CTT GCC ACA TAT AGC AGC CCT GCC TCC TCT G 

5. H9140S (P-M) 
CAG AGG AGG CAG GGC TGC TAT ATG TGG CAA GTA C 

6. H9417R 
CGT CTC TAG ACA GGA AAT GGC TTA AGA GGC CGG AGT GTT TAG C 

7. J4-2271S 
TGC AAT TGG ACT CGA GGA GAG CGC TGT AAC TTG GAG 

8. J4-2776R (Nde I) 
CGG TCC AAG GCA TAT GCT CGT GGT GGT AAC GCC AG 
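Two properties of the primer list can be checked with a short sketch: primers 4 and 5 are reverse complements of one another, and primers 1 and 8 contain the restriction sites named in their labels. The function and dictionary names are hypothetical; the primer sequences are copied from the list above, and the recognition sequences ATCGAT (Cla I) and CATATG (Nde I) are the standard ones for those enzymes.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove the triplet spacing used in the primer list."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


# Primers 1, 4, 5 and 8 as listed above.
PRIMERS = {
    "primer_1": "CGT CAT CGA TCC TCA GCG GGC ATA TGC ACT GGA CAC GGA",
    "primer_4": "GTA CTT GCC ACA TAT AGC AGC CCT GCC TCC TCT G",
    "primer_5": "CAG AGG AGG CAG GGC TGC TAT ATG TGG CAA GTA C",
    "primer_8": "CGG TCC AAG GCA TAT GCT CGT GGT GGT AAC GCC AG",
}

# Both recognition sequences are palindromic, so one strand suffices.
SITES = {"ClaI": "ATCGAT", "NdeI": "CATATG"}


def sites_in(primer):
    """Sorted names of the restriction sites present in a primer."""
    seq = clean(primer)
    return sorted(name for name, site in SITES.items() if site in seq)


complementary_pair = reverse_complement(PRIMERS["primer_5"]) == clean(PRIMERS["primer_4"])
```

A complementary sense and antisense primer pair of this kind is what would be expected for primers spanning the same site in overlapping PCR fragments, although the list itself does not state the purpose of the "(P-M)" label.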

Transcripts of the chimeric 1a/1b clone (whose sequence is 
shown in Figures 16A-16F) are then produced and 
transfected into chimpanzees by the methods described in 
the Materials and Methods section herein and the 
transfected animals are then subjected to biochemical 
(ALT levels), histopathological and PCR analyses to 
determine the infectivity of the chimeric clone. 

Construction Of 3' Deletion Mutants 
Of The 1a Infectious Clone pCV-H77C 

Seven constructs having various deletions in the 
3' untranslated region (UTR) of the 1a infectious clone 
pCV-H77C were constructed as described in Figures 17A-17B. 
The 3' untranslated sequence remaining in each of the 
seven constructs following their respective deletions is 
shown in Figures 17A-17B. 

Construct pCV-H77C(-98X), containing a deletion 
of the 3'-most 98 nucleotides of the 3'-UTR, was 
transcribed in vitro according to the methods described 
herein and 1 ml of the diluted transcription mixture was 
percutaneously transfected into the liver of a chimpanzee 
with the aid of ultrasound. After three weeks, the 
transfection was repeated. The chimpanzee was observed to 
be negative for hepatitis C virus replication as measured 
by RT-PCR assay for 5 weeks after transfection. These 
results demonstrate that the deleted 98 nucleotide 3'-UTR 
sequence was critical for production of infectious HCV and 
appear to contradict the reports of Dash et al. (1996) and 
Yoo et al. (1995), who reported that RNA transcripts from 
cDNA clones of HCV-1 and HCV-N lacking the terminal 98 
conserved nucleotides at the very 3' end of the 3'-UTR 
resulted in viral replication after transfection into 
human hepatoma cell lines. 

Transcripts of the (-42X) mutant (Figure 17C) 
were also produced and transfected into a chimpanzee, and 
transcripts of the other five deletion mutants (shown in 
Figures 17D-17G) are to be produced and transfected into 
chimpanzees by the methods described herein. All 
transfected animals are then to be assayed for viral 
replication via RT-PCR. 

Discussion 

In two recent reports on transfection of 
chimpanzees, only those clones engineered to have the 
independently determined and slightly different consensus 
amino acid sequence of the polypeptide of strain H77 were 
infectious (Kolykhalov et al., 1997; Yanagi et al., 1997). 
Although the two infectious clones differed at four amino 
acid positions, these differences were represented in a 
major component of the quasispecies of the cloning source. 
In the present study, a single consensus sequence of 
strain HC-J4 could not be defined because the consensus 
sequence obtained by two different approaches (direct 
sequencing and sequencing of cloned products) differed at 
20 amino acid positions, even though the same genomic PCR 
product was analyzed. The infectious clone differed at 
two positions from the composite amino acid consensus 
sequence, from the sequence of the 8 additional HC-J4 
clones analyzed in this study and from published sequences 
of earlier passage samples. An additional amino acid 
differed from the composite consensus sequence but was 
found in two other HC-J4 clones analyzed in this study. 
The two non-infectious full-length clones of HC-J4 
differed from the composite consensus sequence by only 7 
and 9 amino acid differences. However, since these clones 
had the same termini as the infectious clone (except for a 
single nucleotide insertion in the 5' UTR of pCV-J4L4S), 
one or more of these amino acid changes in each clone was 
apparently deleterious for the virus. 

It was also found in the present study that 
HC-J4, like other strains of genotype 1b (Kolykhalov et al., 
1996; Tanaka et al., 1996; Yamada et al., 1996), had a 
poly U-UC region followed by a terminal conserved element. 
The poly U-UC region appears to vary considerably so it 
was not clear whether changes in this region would have a 
significant effect on virus replication. On the other 
hand, the 3' 98 nucleotides of the HCV genome were 
previously shown to be identical among other strains of 
genotypes 1a and 1b (Kolykhalov et al., 1996; Tanaka et 
al., 1996). Thus, use of the cassette vector would not 
alter this region except for addition of 3 nucleotides 
found in strain H77 between the poly UC region and the 3' 
98 conserved nucleotides. 

In conclusion, an infectious clone representing 
a genotype 1b strain of HCV has been constructed. Thus, 
it has been demonstrated that it was possible to obtain an 
infectious clone of a second strain of HCV. In addition, 
it has been shown that a consensus amino acid sequence was 
not absolutely required for infectivity and that chimeras 
between the UTRs of two different genotypes could be 
viable. 
WHAT IS CLAIMED IS: 

1. A purified and isolated nucleic acid 
molecule which encodes human hepatitis C virus, said 
molecule capable of expressing said virus when transfected 
into cells. 

2. The nucleic acid molecule of claim 1, 
wherein said molecule encodes the amino acid sequence 
shown in Figures 14G-14H. 

3. The nucleic acid molecule of claim 2, 
wherein said molecule comprises the nucleic acid sequence 
shown in Figures 14A-14F. 

4. The nucleic acid molecule of 
claim 1, wherein said molecule encodes the amino acid 
sequence shown in Figures 4G-4H. 

5. The nucleic acid molecule of claim 4, 
wherein said molecule comprises the nucleic acid sequence 
shown in Figures 4A-4F. 

6. The nucleic acid molecule of claim 1, 
wherein a fragment of said molecule which encodes the 
structural region of hepatitis C virus has been replaced 
by the structural region from the genome of another 
hepatitis C virus strain. 

7. The nucleic acid molecule of claim 6, 
wherein said molecule encodes the amino acid sequence 
shown in Figures 16G-16H. 

8. The nucleic acid molecule of claim 7, 
wherein said molecule comprises the nucleic acid sequence 
shown in Figures 16A-16F. 

9. The nucleic acid molecule of claim 1, 
wherein a fragment of the nucleic acid molecule which 
encodes at least one HCV protein has been replaced by a 
fragment of the genome of another hepatitis C virus strain 
which encodes the corresponding protein. 

10. The nucleic acid molecule of claim 9, 
wherein the protein is selected from the group consisting 
of E1, E2 or NS4 proteins. 

11. The nucleic acid molecule of claim 1, 
wherein a fragment of the molecule encoding all or part of 
an HCV protein has been deleted. 

12. The nucleic acid molecule of claim 11, 
wherein the HCV protein is selected from the group 
consisting of P7, NS4B or NS5A proteins. 

13. A DNA construct comprising a nucleic acid 
molecule according to claims 1, 3, 5 or 8. 

14. An RNA transcript of the DNA construct of 
claim 13. 

15. A cell transfected with the DNA construct 
of claim 13. 

16. A cell transfected with RNA transcript of 
claim 14. 

17. A hepatitis C virus polypeptide produced by 
the cell of claim 15. 

18. A hepatitis C virus polypeptide produced by 
the cell of claim 16. 

19. A hepatitis C virus produced by the cell of 
claim 13. 

20. A hepatitis C virus produced by the cell of 
claim 14. 

21. A hepatitis C virus whose genome comprises 
a nucleic acid molecule according to claims 1, 3, 5, 6, 8, 
or 9. 

22. A method for producing a hepatitis C virus 
comprising transfecting a host cell with the RNA 
transcript of claim 14. 

23. A polypeptide encoded by a nucleic acid 
sequence according to claims 1, 2, 4 or 7 or a fragment 
thereof. 

24. The polypeptide of claim 23, wherein said 
polypeptide is selected from the group consisting of NS3 
protease, E1 protein, E2 protein or NS4 protein. 

25. A method for assaying candidate antiviral 
agents for activity against HCV, comprising: 

a) exposing a cell containing the hepatitis C 
virus of claim 21 to the candidate 
antiviral agent; and 

b) measuring the presence or absence of 
hepatitis C virus replication in the cell 
of step (a). 

26. The method of claim 25, wherein said 
replication in step (b) is measured by at least one of the 
following: negative strand RT-PCR, quantitative RT-PCR, 
Western blot, immunofluorescence, or infectivity in a 
susceptible animal. 

27. A method for assaying candidate antiviral 
agents for activity against HCV, comprising: 

a) exposing an HCV protease encoded by a 
nucleic acid sequence according to claims 
1, 2, 4, or 7, or a fragment thereof to the 
candidate antiviral agent in the presence 
of a protease substrate; and 

b) measuring the protease activity of said 
protease. 

28. The method of claim 27, wherein said HCV 
protease is selected from the group consisting of an NS3 
domain protease, an NS3-NS4A fusion polypeptide, or an 
NS2-NS3 protease. 

29. An antiviral agent identified as having 
antiviral activity for HCV by the method of claim 25. 

30. An antiviral agent identified as having 
antiviral activity for HCV by the method of claim 27. 

31. Antibody to the polypeptide of claim 23. 

32. Antibody to the hepatitis C virus of claim 
21. 

33. A method for determining the susceptibility 
of cells in vitro to support HCV infection, comprising the 
steps of: 

a. growing animal cells in vitro; 

b. transfecting into said cells the nucleic 
acid of claim 1; and 

c. determining if said cells show indicia of 
HCV replication. 

34. The method according to claim 33, wherein 
said cells are human cells. 

35. A cassette vector for cloning viral 
genomes, comprising, inserted therein, the nucleic acid 
sequence according to claim 2, said vector reading in the 
correct phase for the expression of said inserted sequence 
and having an active promoter sequence upstream thereof. 

36. The cassette vector of claim 35, wherein 
the cassette vector is produced from plasmid pCV. 

37. The cassette vector of claim 35, wherein 
the vector also contains one or more expressible marker 
genes. 

38. The cassette vector of claim 35, wherein 
the inserted DNA sequence contains at least one ORF of the 
HCV genome from any strain. 

39. The cassette vector of claim 35, wherein 
the promoter is a bacterial promoter. 

40. A composition comprising a polypeptide of 
claim 23 suspended in a suitable amount of a 
pharmaceutically acceptable diluent or excipient. 

41. A method for treating hepatitis C viral 
infection comprising the administration to an animal in 
need thereof of a clinically effective amount of the 
composition of claim 40. 

42. A composition comprising a nucleic acid 
molecule of claim 1 suspended in a suitable amount of a 
pharmaceutically acceptable diluent or excipient. 

43. A method for treating hepatitis C viral 
infection comprising the administration to an animal in 
need thereof of a clinically effective amount of the 
composition of claim 42. 

ABSTRACT OF THE DISCLOSURE 
The present invention discloses nucleic acid 
sequences which encode infectious hepatitis C viruses and 
the use of these sequences, and polypeptides encoded by 
all or part of these sequences, in the development of 
vaccines and diagnostics for HCV and in the development of 
screening assays for the identification of antiviral 
agents for HCV. 

[Figure: Genome of strain H77 of hepatitis C virus. 5' UTR (341 nt), ORF (9,033 nt), 3' UTR (~220 nt) containing -(U-UC)n-. Long RT-PCR and nested RT-PCR; PCR product (~9.4 kb). Full-length H77 cDNA constructs: pH21 (→ chimpanzee), pH50. Cassette vector: 5' UTR, BamHI, 3' UTR (Afl II, Xba I). Full-length H77 cDNA constructs: pCV-H11 (→ chimpanzee). Four clones used for constructing pCV-H77C (infectious clone).] 
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Q3::paj0032 T&nn3333^ 50 

QGAACTACIG TCTICAGQCA. GAAAGGGflCT A3aCAIGaGG TEAGTEMGAG 100 

TCIO^IQCAG CXTCCAQGAC (XXDCCX:^^ 150 

a3GAAca3Gr gagiacaccg ga™t30C?iG GACGAGcaas TOcrricriG 200 

GfOSW^OCOG CICAftlGOCr GGg^GMTTIQG QOGIQCXDCXr QCSJ^SACIGC 250 

"D^QCGGAGm GIGnOQGIC QOGAASOOaC TiUiUGIlACr QCCIG?m03 300 

GIQCnaOGA GIQGC0CX3GG AQGICTOSm Q^CCXJEGCAC CATCftGCACG 350 

AATCCIY^AAC CICAAAGAAA AACCAAAOST AACACCAACX: GnDQGOCACA 400 

GGAGCTCAAG TIOmSSIG Q0C3CIIX^^ 450 

TGCCOaQCAG QQQOGCEAGA TIQGGIGIQC OOOOGACGftG GAAGACTICC 500 

GAQOOGTCGC AACCTOGAQG TAGAGGICAG CCTAamrA AQCTAOGTOG 550 

GCXCGAGOQC AGGACCIQQG CTCAODGOaG GEADOCTIQG ODOCICEAIG 600 

GCAMISAQGG TIGOOOGIGG GGQGGAIQQC TXTGICICC QOGIQaCICT 650 

CGGCXICAGCr QGOaCCGCAC AGAC3CGGGQG CCTM3GTO3C QCAAnTGGG 700 

-U^AGGICATC GAT^rarriA CGIQCX3QCIT CXSGOGACXZEC ATOOSGIACA 750 

TACCXSCICCT OGQCQOaXT CnOGAGQCG CT3CX::^ia3GC GCTGGOGCAT 800 

QGCXJDGGQQG TIUTQGAAGA OSSCTGAAC TAIQCAAC?^ QGAAGCno: 850 

TCGriGCiur TiuiCTAiur TCXjriuroGC aiDGCiuicr tocctscts 900 

laxcxxnc agcciaccaa gigcqcaatt ocicasoGcr TiACCArGic 950 

ACCAATCA1T GCGCIAACIC GAGIATIGIG TACGAGGa33 COGAIGOCAT 1000 

CCIQCACACr CCX3GQGIGIG TGCCTIGCCT TOOOSOCT AAOaCTTCGA 1050 

QCJEGITQQSr QGaSi^IGACC 0X^033103 CCA0CAG9GA OGGCAAACIC 1100 

CCCACAACGC AGCTTOGAQG TCMMOGM? CiOCITGIOS OSmSQCAC 1150 

CCTCTGCIOG GCCCIUTACG TGGQQGACCT GIG03GCTCT GflL'i'i'iUi'iU 1200 

•nOOICAACr GrrmCCnC TCIOXMQC GCX3^iZD3GAC GA03CAAGAC 1250 

IGCAA L L'lGi'i' CIAICTZmX OQQCCATAaA AOSQ^ICATC QCMQGGAIIG 1300 

GGATAIGATC AIGAACIOCjr CCCCrP03GC i^303nG3IG GFEAGCICAGC 1350 

TGCICOaS^ CrCACAAGOC ATCAIOa^ TOCICACIQG 1400 

QGAGICCT3G OQQQCATAQC GTAi'i'iUiUC AiaGIG33GA ACIGQGOGAA 1450 

QCTCCIQGIA GIQCIGCIQC TATnOOaSG 01103^0303 GAAAOXAOG 1500 

TCADOQQGGG AAAIGOOQQC GQGACXIAOaG CTOOGCnGT TOSTCIOCIT 1550 

ACACCAQQOG CCAZ^GCAGAA CAIOCAACIG AICAAC^OCA ADOGa^GriG 1600 

GCACATCAAT AGCACQGOCT TGAATIGCAA TGAAJ^GCCIT AACACQQGCT 1650 

QGnTIAQCAQG QOIOi'iClAT CAACACAAAT TCAACTCTIC AGQCIGTOCT 1700 

GAGAQOTIGG COOnGGCG ADGCCTim: GAlTi'iUCCC i^GOGCIGGOG 1750 

TCCTATCAGT TATQCCA^^ GAAGCQQCXjr CGAOGAACX3C OXTACIGCr 1800 

QGCACIACCC TQCP^M^NXn: TCIOQCATTG TQCCXDQCAAA GZ^GCGTGIUT 1850 

QGCC033IAT AITQCnCAC TCQCAGOGQC 0103100103 3AA03AQ03A 1900 
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CAGGn?CX3QQC QOSQCTACCT ACAQCIQOQG TQCAAAIGAT AOQGAIIGflCr 1950 

TCGIOITim CCAGOrTOG QCAATIQGIT GOGHTIGiaa: 2000 

1QGA3X3AACT CAACIQGATT CACCAAAGIG T30QGAQa3C CXXDCTIGIGT 2050 

CAia3GAa3G GIQ3QCAACA. ACADCnOCT CIGGGGCACI GATIQCnGC 2100 

QCiWCAia: OGAAGOCACA. TACICIOaGT GOQQCiaDQG TCOCIQGATT 2150 

ACACXXZOoT QCA!IQGIOGA. CIS^CXXGIAT AG3Ji'i'iUQC ACEATCCTIG 2200 

TACCATCAAT TACACCA33ff TCAAAGICAG GAIGTAOCIEG QGAGGGGICG 2250 

AGGACAQQCr G3AAQC3330C TOCAACIQGA. OGOQQQGOGA AGQCIGIGKT 2300 

CIQGAAG?^ GOGSO^aSIC CGAQCICAQC aUilGC'IGC TCflDCACCAC 2350 

ACAGIQ3CAG GIDCITCOGr Gi'iUi'i'iLIAC GACOCIGOCA QOCTIOTOCA 2400 

aimxTCAT ocAoriacAC c^gaacatig iqgaccsigca. geacitoiac 2450 

QGQGTAGGGfr CAAQCJiTCGC GICCIQQGOC ATIAAGIQGG AGTADGTCGr 2500 

TCIGCIGITC CITCT3CriG CAGAaaOOOG GGICIQCraC T3CnGIQ3A 2550 

TCATCTEACr CAIAiaXAA Ga3G?i3Q033 CmOGAGAA. CCI03m?m 2600 

CICAAIQCAG CATCGCTGGC CQOGAOSCAC GCJCCITGIGr CCITCCIOGr 2650 

GITCnCIGC TnOGGIQGr ATCIGAAGQG TAQCT333IG CCa33AGCX3G 2700 

TCTACGCXXT CEACQGGAIG TGQOCICICC TCCTGCIOCT GCIQGGGTnG 2750 

CXnCAGCQQG CA.'MQCACr GGACACQGAG GIQGCXDGGGT OJEGIGGCXSG 2800 

CGITCTIdT GTOQQGITAA TOaCGCIGAC TCIGIOGCCA TATEACAAGC 2850 

GCEATATCAG CT3GIQCATC T33TQ3CriC AGEATTnCr GACCSO^GTA 2900 

GAAGCX3CAAC TGGAOJCGIG GGITCOCCQC CICAACCTGC 0330000003 2950 

CGA[iaCCGIC AICITACICA TCIGIGTAGT ACAOOOGACC CIGOIKTriG 3000 

ACAICACCAA ACIACTOCIG GCCAICnOO GACCXXTTIG GATICTICAA 3050 

GCCAGITIQC TTAAAGICGC CEACITOGIG aSOOTICAAG GQCnCIODG 3100 

GAIUToOGOG CTAGOQCQGA AGAOaQGCGG ^OJECATEAC GIGCAAATCG 3150 

CCAICAICAA GmOOGOOO CITACIQGCA CXHC^UiGm TAADCATCIC 3200 

ACCXTTCTTC GAGACIG30C GCACAACOQC CIQOGAGATC TOGOO^IGGC 3250 

IGflQGAACXIA GiaJECTXCT CCOGAAIQGA. GAOCAAGCIC AilCADGIQOG 3300 

GGGCAGATAC CQG030GIGC QGIGACATCA. TCAAOCSOCrr QOQOSIUICr 3350 

QCCCGmGOG QOCAQGAGAT ACIGCnOOG CCAGQQGAOG GAADI3GICIC 3400 

CAAGQQ3IGG AGGl'i Q CI G G OSQCCATCAC Q30GIAO3QC a«3CAGAD3A 3450 
GAQOOCICCr AOGGIGTAm ATCACCAGGC TGACIGGCOG QGACAAAAAC 3500 
CAAGIQGAQG GIGAGGIOC3V GAiajDGTDCA ACIGCEAOCC AAACCnCCT 3550 
G3CAACGTQC ATCAATO3GS TAIQCIGS^ 'iUiUilAOCAC Q9GQOCX3GAA 3600 
GGAQGACCAT OSCATCACCC AAQQCJCOCIG TCATCCAGAT GTATACCAAT 3650 
GIQGACCAAG ACCTIGIQQG CIGOCCOOCT CCICAAGGIT GCOGCIGATr 3700 
GA-CACCCIGT ACXZnmSOOT CCTOGGACCr TTACCIGGIC AQG?GGCAGG 3750 
CCGA.TOICA.T TCCQGIGGGC CGQQGAGOIG AI?GCAGQaG T?^i3GCIGCIT 3800 
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TCQGCCQOaC CnftTTTCCm CTIGAMGQC TCCTCQQaOG GIODQCIGIT 3850 

GIGOGCXDOOG QCSOmXG TOOOOCIMT CAQQQCXDQCG GIGIQCACGC 3900 

GIQGACT9QC TS^AAQOQSIG GACTTTAOXr CIGIQGAGAA CJCrAGOGACA 3950 

ACCATCAGAT QCXXX3GIGIT CAC3QGACAAC TOCICTCCAC CAQCAGIGOC 4000 

ccAGAQcnc cAQcsiGaaa: acciqcao^x loocAcaaac AcaasEAAGA 4050 

QCAOCAAQGr C]C03aCIQ0G TAC3aCAG00C AGQQCTACAA. OJiUi'iULfiU 4100 

CICAACXXCr CIUi'iGCIGC AAaxrraQQC TTIQGIQCIT ACATOTXAA 4150 

QQCCCAiaQS GTIGATCCm AEATCAQGAC OGG3GIGAGA ACAATEADCA. 4200 

CIQGCAGCXr GATCAOJCAC TCX3£CTADS OCAAGITQCr 1QGaGAC3QC 4250 

GGGIQCICAG GAGGIQCITA. TCACATAAaA ATTIGIGAC3G NHX^CCNJIC 4300 

CACQGAITGCC ACAICCAICT TG3aCA3X]Q3 CACIGTOCIT GAJXAAOSG 4350 

AGACIQCX3QG Q3QGAGACIG GITGIQCIDG GC30T3CTAC CCCICDCQOGC 4400 

lODGICACTG TCICCCATCC TAACATO3AG GAGSnOCIC TCIQCADGAC 4450 

CX3GAGAGATC CC XJi ' l'l'lA OG GCAAGGCIAT CCJQCXnOGAG GIGATCAAGG 4500 

QQQGAAGACA TCrOCATCITC T3GCACICAA AGAAGAAGIG CGAOGAGCIC 4550 

GCXDOGGAAQC TCGTGGCAIT GOGCATCAAT QCQGIQGOCr ACIAC30GQGG 4600 

lUITGACGIG TCIGICATCC CGAOCAGCQG OGATCITGIC GIOSIGTOGA 4650 

CX:GAIGCICr CATCACTQGC TITACCX3aCX2 ACnOGACIC TGIGATAGAC 4700 

TGCAACACGT GIGTCACrCA GACAGTOGAT TICAGCCriG AaXTCACCTT 4750 

TJOIMTGAG ACAACCAOGC TCCXXCAGSi TQCTCIUIO: AGGACICAAC 4800 

QCX33Q3GCAG SCTGGCAQG GQGAAGOCAG GCAICTATAG ATTTGIGGCA 4850 

Cm33QGAQC GCOCCTGQGG CATCITOGAC TOGiaCTX TCIGIGAGIG 4900 

CTATCAOSCG G3CIGIQCIT QGEATCAGCT CACGODOGCC Q¥3ACI20^ 4950 

TTAQCSrrACG AQC!GEACATC AACAQ0Q033 GGCnDGGGT GIGOCAGGAC 5000 

CAICITGAAT TTTGOGAGOG CGICITIMG QGCXTICACIC ATATAGAIQC 5050 

ccACirrrm tcqcagacaa aqcagagiog qgagaaotct GcnAanoG 5ioo 

TAGOGTACCA AQOCACOGIG IGGQCTAGOG CICAAGOOOC TCCGCCAIOG 5150 

IQQGACCAGA. TGflGGAAGIG TnGATOOQC CTEAAAOXA GCXJEOCAIGG 5200 

gcx::aacacoc ciqctaiaca gacigggocx: tgticagaat gaagicagcx: 5250 

tgagqcaoix: aatcaccaaa tacaicaiga. caigcaigic gggogaccig 5300 

GAGGIDGICA CGAQCACCIG QGIQCIDGIT GGOGQOGia: TOGCIGCICr 5350 

QGC03Q3IAT TGCCIGTCAA CAGGCTOOGT QGICAIZCT3 QGCAGGAIQ3 5400 

TUTIGIDOGG GAAQQQGGCA ATTATSOTG ACAGGGAGGT TCICIACCAG 5450 

GAGITOGAIG AGA303AAGA. GIGCICIC?>G CAdTACGGT ACATCGAGCA 5500 

AGQGATCATC CrCQCIGAQC AGITCAAGCA GAAGQQQCTC aGaCIGCIQC 5550 

AGACCGCXJEC CT^GQCAIGCA GAGGITATCA 000010^101 QCAG?^2CAAC 5600 

TCGCAGAAAC TCGAQ3ICIT TIGGGOGAAG CACMUIOGA ATTTCAIC?^ 5650 

TCGGA.TACAA TAdTOGGOG QCCIUTCAAC QOIGOOIGGT AAOCCOGCCA 5700 
'I'l U L'i'iCAST GAIQOCmT ACftGCIQaOG TCACGAQOOC ACEAACGACT 5750

Q3C]CAAACCX: TCCICITCAA CATAnQQQG GGGIQQSIQG CIQ03CAGCT 5800 

cx3aoacxxxDc GGiQaoocm cioacrriGr qqgiqcigqc cisgciqqog 5850 

CCX3CCAOXX3G CAQOCIETQGA CIQQQSAAGS TO?I03IQGA CALL'iL'i'iUCA 5900 

QGGrAIQQCG 0300031030 0Q3AQCICIT GEAGCAIITCA AGKCCATCRG 5950 

OQGIGAOOIC OOCICGAOaS AGGACCIQOr CAMCIQCIG OOaOOCATOC 6000 

TCICQOCIOO i^ODOCnom 0100010100 TCIQOQCAGC AKEftCIOaOC: 6050 

03QCA00nG QOOGGQOaGA QOOOOCAGIG CAMQGAIGA ACOGOCIS^ 6100 

AGCCna3QC IGOOGQQQGA ACCftLLUi'i'iL' OOOCACGCAC TftOOIQaOGG 6150 

AGAQOGAIQC JS^QaOQGGQQC GICACIQGCA TACHSGCAG OCIGACIGfm 6200 

ACOCAQCia: TSAQQCXSACT QCKTCAGIOG AIA^mimj AGIGIACCAC 6250 

TCCAIQCIOC Q3ITCCIGGC TAAGQGACAT CIQQGACIQG ATAIQCXSAQG 6300 

1OCIOAQ0GA CnTAAGADC TQ30IGAASG OCAAGCIGAT QQCACAACIG 6350 

CCIQGGATIC CCmOIOIC CIOGCAG03C 0301^000 0Q3ICIQGO3 6400 

AGGAGACQ3C ATTATGCACA CIOOCIQQCA CiOIOOAOCT GAGAICACIG 6450

GACA2UICAA AAAaSOGAQG ATGAQGATCG TOaOBXTAG O^anOCAQG 6500

AACATCIOGA GIGGGACGIT OGOCATIAAC GCCEACAGCA GGQGOCCCIG 6550 

TACrcCGCIT CCTGGQOOGA ACEA25^2^GIT OGOQCIGIGG AGGGIOICIG 6600 

CAGAGGAAm C!GIQGAGAIA AQGCQQGIGG QOGACTICCA CTAGGIAIOG 6650 

GGrCATCACTA CIGACAAICT TAAATCCXDOG TQCOOVia: CAIOQC30CX3A 6700 

A l'i'i'l'i CACA GAATIGGAOS QQGIQQGQCT ACACAQGITr GGGCCOQCIT 6750 

QCAAQCCCrr GCIQG3QGAG GAQGTATCAT IQiGSGrAQG ACICCAQGAG 6800 

TACrCQGIQG QGroQCAATT ACCnOOGAG ODQGAACOQG ADSIAQCOGT 6850 

GnGACGIOC AIQCICACIG AIGCCIOOCA TAIAACZ^GCA G?mDQ300G 6900 

QGAGAAGGIT GGCGAG?03G TCACCOOOIT CTAIQOOCftG CI00I03QCr 6950 

AGQCAGCIGT 0CX3CIGCAIC TCTCASGOCA ACnOCAODG OCAAOCAnGA 7000 

CIGCXXITCAC aGOGAGCrCA TAGAG3CIAA 0010010100 AGQCSGGPGA 7050 

TGGGOGOCAA CATCACCAGG GTIGAGICAG AGAACAAAGT GOIGATIOIG 7100 

GACIOOnOG ATCOOCnor Q0aO^GG?>G GAIGAG033G AGGICIOGOr 7150 

ACCIGCAGAA ATTCIQOGGA AGICIOGGPG AnOOCXXDOG GOOOIQOOOG 7200 

1CIQ330Q0G QGaOGACIAC AADG0Q003C TAGI^GAGAC GIOGAAAAAG 7250 

GCIGACIAD3 AACCAOOIGT GGIOCAIOOC TSOOQOCIAC CACOIGCAOG 7300 

GraCCCIOOr GIGOOKXGC OIOOGAAAAA GOGEAOGGIG GI OGICAOOG 7350 

AAlCAAOCOr AIUTACIGOG TTGGOOGAGG TIGQCAOIAA AAGirnOGC 7400 

AGCIOCIGAA CnOOQGCAT TAOQQGOGAC AATADGACAA CATCGICIGA 7450 

QCCC]GC02Cr TCTOGCT3QC CGCOOGACIC OOAOOITOAG TOOTATIdT 7500 

CCAIGGCGCC QCIQGAQQGG GAQGCIQ03G ATOOGGAIUr CAGGGAD3QG 7550 

TCAIOGIOGA Q33TCAGIAG TQGQQCOG^C ACQGA^O^ TQGIGIQCIG 7600 
CDCAAIGICr TATICCIGGA. CAG3CX3CACr CXJICKnOOG TQOGCIQCQG 7650

AAGAACAAAA ACIQCXXMC APOXftCHSV QCAACTOGIT GCmJSOCNT 7700 

CACAATCIQ3 IXjmTTCCPC CACTICAaaC i^GIQCriGCC AAMQC3^GAA. 7750 

GAAAGICACA TTIGACAGAC T3CAAGncr QGACftGQCAT TACC2^QGftCG 7800 

TQCICAAQGA. Q3ICAAAQCA QCXmJECAA AAGIG^^^QQC TAACTIQCm 7850 

TCQGIAGAQG AAQCITQCAG GCIGAOSaCX: COyZATICAG GCAAATOCAA 7900 

GmOGCEAT G3Q3CAAAAG ACGICOGITG aZAIGQCSGA AAQGQ0C?I5G 7950 

(DGCACATCAA. CICQGIGIQG AAAGAGCTIC IQGA^^GACAG IGEAJOOCA 8000 

ATAGACACm CC3^TC3^IOC3C CAAGAACGAG GTmCIQCG TICAQCXUIGA 8050 

GAAQQGQQCT OJCAAGOCAG CTmcCICAT OGflGTIQQCX: GACCIQQGCG 8100 

IGGOaSIGIG OSO^AGAIG QGOCIGEAOG AOSIGGTEAG CAAGCTCOOC 8150 

CIQGOG3IGA. 1GQGAAGCIC Cma3GATIC CAATACTCAC CAQGACAGCX5 8200 

GOTGAATIC CraGK3CAAG QGIGGAAGIC CAAGAAGAQC QCGAnX3QQC?r 8250 

TCIOJEMGA. TADOCGCICT TITSOXrA CAGICACIGA GAGCGACATC 8300 

QGIACGGAQG AGGCAATTTA CCAAIGTIGr GACCTQGAOC OXAAQGCm 8350 

CGIGGQCATC AAGTCQCICA CIGAGAOaCT TEATCTiaaG GGaXTCTTA 8400 

CCAATICAAG QGGQGAAAAC T303GCrA0C GC?GG?IGCCG a3D3AQQ3GC 8450 

GTACIGACAA CTAOIEGIQS TAACACCCIC ACITQCrACA TCAAOXXTG 8500 

GGCAQCXICGr QGAGGQGCAG GGCICCAQGA CTGCACCATC CIOGIGIGIG 8550 

GCGAQGACrr AGTCGITATC TGIGAAAGIG CQQGOGTCCA QGAQGAOSOS 8600 

GGGAQOCIGA GAQCCTTCAC GGAGQCIATC ACXZAGGTACT COGCQaOGCC 8650 

QGQQGAiXCC GCACAACCAG AAIAQ3ACIT QGAGCITAIA ACATCAJIOCT 8700 

CCICCAACGT GICAGrCQCC CAOGAOQQOS CIG3AAAGAG GGIUIACmC 8750 

CITACGCGIG AOXrCACAAC COaOCTOGOG N3^^G00303T GGGAGACAGC 8800 

AAGACACACr GCAGICAATT GCIQ3CI2^ CAACA22^ATC AIGITIQCCC 8850 

QCACACIGIG QGa3y3GAIG ATACHSOQ^ CTCAli'i'iUi'i' T?m3rOCIC 8900 

ATAGOCAGQG ATCAGCITGA ACAGGL'iCiT AACIGIGAGA ICTAOGG?^ 8950 

CIGCTACIOC ATAGAADCAC IGGAICTAO: TOCAATCATT CS^AAGACTC 9000 

ATGOOCICAG CGCALi'i'i'iCA CTOCACAGIT ACICTOCAOG TGAAAICAAO? 9050 

AGGGIGQODG CAIQOCICAG AAAACTIGQG GIOQaQCCCT TGGGAGCTIG 9100 

GAGACAGCQG QGCXiDQGAQGG TQOQOaCniG QCirCIGICC AGAGGAQGCA 9150 

QGGCIGQCAT ATGIG3CAAG TACCTCTICA ACIQGQCAGT AAGAACAAAG 9200 

CIGAAACTCA CTQCAAIAGC QQaoaCIQGC OQGCIQGACT IGiaDQGTIG 9250 

GITCACQGCT QGCTACAGOG QQGGASO^ TEATCACAGC GIGTCICAilG 9300 

CCa3GCCCXI3 CIGGnCIQG TmOGCEPC TOCIGCIOX: TGCAGQ3Gm 9350 

QGCAICIAOC TCCIGOCCAA CCGAIGAPG3 TIQGGGTAAA CACTOOGGOC 9400 

TCITAAQGCA TTTCCIGITr I'i'i'i'i'i'i'i'i'i' I'i'i'i'i'i'i'iTi' Ti'i'i'iUi'i'i'i' 9450 
TOGIQGCIOC ATCITAQCXDC TAGICAOOQC TAOCIGIGAA AGGICOGflGA. 9550
QCXDQCAIGAC TGCAGAGAGT QCTCftlACIG QOCTrcTCIQC AGAICAIGT 9599 
MSINEKB;^ TKENINRRPQ DVKFEOBQQI VQG7YLLEFR GEE^DSVRZflR 50

KTSERSQERG RRQPIEKZ\RR PECKEWAQPG YBm^RSMj OSmmLSP 100 

RGSRPSWGPT DPRFRSRNLG KVIIIIIiimF AEOCyiELV GAE]j3GAARA 150 

li^HSVRVLED O/NYMOSILP GCSFSIFLLA LLSCLIVPAS AYQVRNSSX 200 

YHVTNDCENS STTfEAMOMl LHTPQCVPCV RH3SIASRCWV AVTPIVAIRD 250 

GKLiETIQrjRR HTEXIMSSAT IjCSALY\CTL CGSWDJO^ FIFSPRRHWT 300 

1QDC3S1CSIYP aUTa^FMZ^ nmm^SFIA ALWAQLZIH PQ?02CMD^ 350 

AHWGVLAGIA. YFSMS/OSIWRK VLWLLLEAG VD?^BIHVIQG E?0?naGLV 400 

nr.T.TPnAKryr iqlininssw HTMSiaLigcN esiotgwlag LF^in^iKnsiss 450 

QCPEKLASCR RL1DF2^QG5^ PISYMCSGL DERPYCMCTP PRP03IVPAK 500 

SVQGEVYCFT PSPWVGTID R9GAPIYSWG iWIEVFVLN NIRPEDGNWF 550 

GCIWyiNSIGF TECV03APPCV IQGVQ^NTLL CFI!XFRKHP EATYSROSSG 600 

MTPRCMVD YPYRIMiYPC TUSIYTIFKVR MYVQC?^EHRL EA^OWTRGE 650 

RCDI^mrSRS ELSPLLLSrr QWQVLPCSFT TLPALSIGLI HEBgSIIVEVQ 700 

YLYGVGSSIA SWAIKWEYW LimLLADAR VCSCCAJyiMLL ISQAEAALEN 750 

LVHiSIAASLA GIH3LVSFLV FPCFAWYIKG FWVPGAVYML VTMaI PT J I ,T ,T ■ 800 

lALPQHAYAL DTEyAASCQG WLVGLMALT LSPYYKRYIS VOIaML^YFL 850 

TFIVEAQLHVW VPELNVRGSl DAVTT.TTCW HPTLVFOITK LEXAIFSELW 900 

UjQASmvVP YFVHVQCSLLR ICALARKE?^ OIYV^^IK DGALTSEWY 950 

NHLTPEiO^ HNGIRDLAVZ^ VEPWFSFME TECLriW3AE0? AAOSDIINGL 1000 

R/SARKSQEI LLGPADOWS KGWEILLAPIT AC^QQIRSLL GCIITSLTGR 1050 

DKNQVB3EVQ IVSTAT^IFL ATCINSVQVr VYHS^GIRIT ASPKGPVI<34 1100 

YIM/DQDLVG WPAPQGSRSL TPCTGGSSDL YLVTRHAEVI PVRRRODSBG 1150 

SIoLSPRPISY LKGSSGGPLL CPiOiAVGLF RAAVCIKm KAVDFIFVEN 1200 

ljC?nMRSPVF TENSSPPAVP QSPQVZ^HLHA PTGSQCSIKV PAAiAAQGYK 1250 

VLVINPSVAA HGF^GAYMSK AHSVDHSIIRr GVRiTilGSP nYSTYC3CFL 1300 

ADQGCSGG?^ DIIICDBCHS TDATSHGIG IVLIDQAErAG ARLWIATAT 1350 

PPGSVIVSHP NIEEVALSrr GEIPFYC2^ ELEVIK3a^ LIPCHSKKKC 1400 

DELAAKLVAL GINAVAYYEG m/SVIPTSG DWWSIDAL MrGFTODEDS 1450 

VIDCNTCVTQ TVDFSLDPTF TIEi'i'l'LPQD AVSRIQPPSl imSKPGIYR 1500 

FVAPGEE^SG MFDSSVLCEC YDAGO^^YEL TEAEnVRLH AYMSTTPGLW 1550 

CQEHIuEFWEG VFIGLIHIDA HFIiSQIKQSG EJ^FPYLVAYQ AIVCARAQAP 1600 

PPSWDQ^WKC LIRLKFIIB3 PTPLLYRLGA VgNEVTLTHP JTFYJmO^ 1650 

ADLEWTSIW VLVQGVLAAL AAYCLSIGCV VIVGRIVLSG KPAIIH3^ 1700 

LYQEFDEMEE CSQKiEYIEQ OXIMLABQEKQ KADGLLQIAS BHAE^/TTPAV 1750 

QIN^KLEVF WAKHMEWIS GIQYIAGLSr LPC3SEAIASL M?^TAAVrSP 1800 

LTIGQTLLFN IIGGWVAAQL AAPGAATAIV GAGLAGAAEG SVGDaWLVD 1850 

IIAGfGAG^ZA GALVAFEOMS GEVPSIEDLV NLLPAILSPG ALWGWCAA 1900 
HERHVGFGE GAVgWMNRLI AFASPOSKVS FIHWPESHA AARVEAILSS 1950

LTA^TQURPT, HSWISSECIT PCSGSWLRDI WDWICEVLSD FEdWLKMCLM 2000 

FQLEGIPFVS CQPS£RGVWR C3DGIMnPCH OGAEITSMC N3IMEU:\A][PR 2050 

1CHNMW9GIF PINZOTIGPC TELPMNYKF AUMFiySAEEY VEXPRWOTH 2100 

wsca-innsiL kcpoqipspe ffieudgv/rl hreappckel li^eevsffwg 2150 

IHEYPVGSQL PCEFEEEmV LTSMLIDPSH rEAEAA£3?FL ARGSPPS^ 2200 

SSASQLSAPS LKMCmNHD SEDA KT . TFT ^ UMQWHS^ TIRVESEMCV 2250 

VILDSFDPLV AEEDEREVSV PAEHEKSRR EARALPVWAR EDYNPELVET 2300 

WKKPD!£EPPV VH3CPLPPPR SPPVPPPRKK RIVVLTESIL STALAELAIK 2350 

SR^SISGI ICZNITrSSE PAPSQCPEDS EVESYSSMPP LS3EPC3DPnL 2400 

SDGSWSTv/SS ammJTSTCC SM5YSWIGAL VrPGAAEEQK L^>INALSNSL 2450 

lEHHNLVYST TSRSACQFQK KVTFERIjQVL DSHOTVLKE VKAAASKVKA 2500 

NLLSVEEACS LTPHiSAKSK PS£GAKEVPC HARKAV2SfflN SVWKESLLEDS 2550 

Vi'P lUi'l'lM A KNEVFCVQPE KGGKECPARLI VFEODSVEIVC EKMALYEWS 2600 

KLPIAVM3SS YGPQYSPGQR VEEXA/QA3>KS KKTHXGFSYD TRCFDSIVIE 2650 

SDIRIEEAIY QCXXiLDPQAR VAIKSLTERL YVQGPLINSR GE3SD3YFFCR 2700 

ASGVLTTSJG NTUICYIKAR AZ^CRAAGLQD CIMLVOXDL WICESAGVQ 2750 

EDAASLEAFT EAMIRYSAPP GDPPQPEYDL ELITSCSSNV SVZ^HDG?0<R 2800 

VYYLTERDPIT PLARAAWETA RHTBMSWLG NIIMFAPIUW ARMIL2y!IHFE 2850 

SVLIARDQLE QALNCEIYGA CYSIEPLDLP PIIQRIB3LS AFSLHSYSPG 2900 

EI^^RWAACXR KLGVPPLRAW RHRARSVRAR IjLSRQC3RAAI GOCmMAJAV 2950 

RTKIi^LTPIA AAGRLDLSGW FTAGYSGODI YHSVSHARPR WFWPCLLLCA 3000 

AGVGIYIUN R 3011 
QOCNXXXXX: T&!I!C333^ 50

QGAACIACIG TCTICAOGCA GAAAGGGICT AGOCATOGQG TIAGIMGAG 100 

TCTCGTOCAG CCIGCAQG?^ OOXXXTCCC QQGAGAQCXilA TAGIQGICIG 150 

CGGAACraJT GAGmCAODG GAATIGCCM3 G?O3AC0QQG lOJi'i'iL'i'iG 200 

GkTCPACCCG CICAATGCrr GGAGAmOG GCGIGCOOCC GGGAGACIO: 250 

TAGCXI3AGm GIUnOGGIC GOGAAAQQCC TIGIQCICACr QaCTCATAQG 300 

GTGCITCOGA GIGCCCa333 AQGTCIGGIA CATCAGCAa3 350 

AATGCmAAC CTCAAAGAAA AAOCAAACGT AACACCAACC GOOQCCryiCA. 400 

GG?\CGIGAAG TICCaSQGGG GIQGICAGAT OSTIGCSIQGA. GITEAGCIGr 450 

TCCXDSCXSCAG QGQaaCCAQG TIOQGIGIQC QCX303ACEAG GAAQQCTIOC 500 

G?m3Gia3C AAGCIGGIQG AAGGOGACAA CCIATCCX^^ AQQCinSQOG 550 

ACQCX3AGG9C AGQGCXIEQQG CICAGCCXDQG GmCXXTIGG CCQCTTTAIIG 600 

QCmiGAGOG CXnOOGGIGG GCAQGATGGC OmEGICACC GQQOaQCia: 650 

CGGCXTAGIT GGGQCOXAC GGACXXXTOG CCTAGGIDGC GEAACITOGG 700 

TAAGCTCATC GATACCCITA CA.1QQGGCIT GGCQGATCIC ATGQQCTACA 750 

TroCGCTOCT CX3GCGQODCC CTAQGQGQQG CIGCXZAQGQC CITGGGACAC 800 

GGIUroCXSGG TTCIQGAGG?^ CQGQGIGAAC 1ATQGAACAG QGAftCTIGCX: 850 

CGGITGCICr TICICIAIUr TCCICnOGC TCIQCIGICX: TGnTGAOIA 900 

TCGCAGCnC OGCTTATGAA GIGCQCAA33 TGTCCX3GGAT AIACTATCIC 950 

AGGAACGACr GCIOCAACIC AAGCATIGIG TATGAQGCAG CQGADC3IGAT 1000 

O^TGCATACr CCGGQGIGC^G TOCOCIGIGT TCAQGAGGGT AACAGCIOr 1050 

GTIGCTGGGr AGQGCICACT CCCACGCICG OGGdAQGAA T3CX:AGCX3IC 1100 

CCCACTA031 CAATAQGAOG CXIADGICGAC T]l3CICC?riG QGAQOGCTOC 1150 

TTTUiarrcC GCTATCTACG TOGOSGAICr CIQCQGA-TCT ATTTIDCTOG 1200 

TCICCCAGCr GTTCACCTTC TQQGCTGGGC Q3CATGAGAC AGIQCAQGAC 1250 

TGCAACIGCr CAATCTATCC CGQCCATCIA TCAGGICACC QCATOGCriG 1300 

GGATATGAIG ATGAACIGCT CA3CIACAAC AOXCIAGIG GIGiaSCAGT 1350 

TCCTCQQGAT OCCACAAGCT GI03IQGACA TOGIGQaOGG QQCXrACIQG 1400 

GGA3IGCIQG OGGGCCriGC CTAZCAITCC ATQCnAaQGA ACIQQQCIAA 1450 

GGTIUIGATT GIGGQGCIAC TCTTIGCXDQG a3nGACX3QG GAGACQCACA 1500 

CGACX9C3aGAG QGIQGCOQGC CACACX^^DCT GCX9C3GnCAC GICQL'l'i'i'iC 1550 

TCATCIGGQG OSICICAGAA AATCCAQCIT GIGAATACTA ADQQCAGCIG 1600 

GCACATCAAC AGGACIGCCC TAAATTGCAA TGATICOCIC CAAAnQOGT 1650 

ICnTGODGC GCIGITTTAC GCACAGAAGT TCAACIOGIC CQQ3IGC3CDG 1700 

GAGQGCATGG 0CAQCIQCXI3 GCCCAITCAC TGGTIDGOCX: AQQGGIGGQG 1750 

GCCCATGACC TATACTAAGC CTAACAGCIC GGATCAGAGG QCITATIQCr 1800 

GGCATIACX3C GOCIQGACCG TCIGGIGIOG TACCCQGGIC GCAGGIGICT 1850 

G3TXXAGIGT ATIUrnCAC QXAAQCXTT GTIGIQGIQG QGA3CACXISA 1900 
TCGncaOGT GTOCXTTACGr ATAGCro33G QGAGAAIGAG ACAGAOGIGA 1950

TOCIOCTCAA CAACAOGOSr CaOOCACAAG QCAACIQGIT 2000 

TOGA1GAATA. GIACTOQGrr CACTAAGftCG IGCXSGAQGIC ODOOGIGIAA. 2050 

CATCQGGQQG GTOOGTAACC QCADCTIGAT CIGQCOCAOG GACIGCTIDC 2100 

QGAAQCJCQC OGAQQCTACr TACACAAAAT GIGQCia3QG QOCCIQGnG 2150 

AO^DCTAQCSr GCXICAGEAGA CmCOSfflAC AOQL'i'i'iUQC ACEAiXDCTG 2200 

CAC?ICICAAT TTITOCMCr TEftAQGTEAG GALLUimUiG QQQ3G0C3IQG 2250 

AGGACAQQCr CAAIQCnSCA. TQCAATIQGA CICGftGGftGA. GCX3CIGEAftC 2300 

TIQGAGQO^ QQGATAGGIC AS^ACICAQC aOGCIQCIQC TCICIACAAC 2350 

AGAGIOSCftG AEACIGGOCr GIGCTITCAC CAOOCTAOOG QCmATCCA 2400 

CTQCTTIGAT GCmCTOCAT CAGAACZflOG T3GADGIQCA ifflADCIUEAC 2450 

QGIGEAGGGT CAQOGrnGr CICCmOCA ATCAAAIQQG AGIACAIOCr 2500 

Gnocrmc citcicctqg cagadgogcg cgigigigoc tqctigioga 2550 

TGAIGCIGCT QVTAGCCCAG GCIGAQQCXDG CCITAGAGAA CITGGTOGIC 2600 

CICAMG03G 0GIC03IQGC GQGAGCX3CAT QGIATIUICr CCmCTIUr 2650 

Cjl ' lUl'l CT3C GCCX3QCT33T ACATEAAQQG CAQQCIQQCT OCIGOOaoaG 2700 

QGIATCCriT TTATOGOSEA TQGQQ3CT3C TCCIQCIOJ!? ACTOQaGITA 2750 

CCACCACGA3 CTTAOGCXTr GGACXX3QGAG AIGOCIGCAT (JJIQ033332 2800 

TGCGGTTCIT GTAGGTUIGG TATIUTIGAC CITCTCACm TACTACAAAG 2850 

TCinUICAC TAQGCICATA TGGTOGITAC AATACTTTAT CACCA30X 2900 

GAGGCGCACA TCCAA3IGIG GGTCCXXXXX: CICAACX3TTC QQGS^GGCX33 2950 

CGATGCCATC ATGCTCCICA CGIGimSGT TCATCCAGAG TIAAi'i'i'i'iG 3000 

ACATCACCAA ACTOCTGCIC GOCATACIOG GCXXX3CICAT GCTGCIGCAG 3050 

GCIQGCATAA CGAGAGIGCC GIACTTCGIG QGOGCICAAG QXTCATIDG 3100 

TCCATGCATG TTAGTGQGAA AAGTCGQOQG QGGICATIAT GTOCAAATOG 3150 

TCTICATCAA GCIQ3GQ3CG CIGACAQ3EA QCTAQGTTIA TAADCAICIT 3200 

ACCCCACIGC QOGACTOGOC CXACGaOGGC CTAGGAGAIX: TIG0GCIP3GC 3250 

GGIAGAGGCC GIGGICTTCT CQGCCAT3GA C^^CCAAGGIC AICADCTQQG 3300 

GAQCAGACAC OGCIGOSIGT GGQGACAICA TCTIGGGICT AGQCGTCIOC 3350 

GCXDCGAAQQG QGAAGGA3AT Ai'i'i'i'iQGGA ODQQCIGATA GICI03AAQ3 3400 

QCAAQQGIQG CGACTOCTIG COCCCKTCNZ GGCXnACTQC CAACAAA03C 3450 

GGGGCGIACr TGGTIGCATC AICACEAGCC ICACAGGOCG GGACAAGAAC 3500 

CAGGICGAAG GGGAQGITCA AGIGGITICr ACCX3GAACAC AATCTITCCT 3550 

GGOGAOCIGC ATCAACGQOG TUIGCTGGAC TSIdAGCAT GGCXSCIQGCT 3600 

CGAAGAOXT AGCXX3GICCA AAAQCTQCAA 'ICACCX:AAAT GIACACCAAT 3650 

GTAGACCIGG ACCIOCT09G CTOGCAGGOG (DaDGGCQOGG CG03CICCAT 3700 

GACAGCATCC AGCIGTCGCA GCTCQGAQCT TTACnOGIX: AGGAGACATG 3750 

CTGATCTCAT TG0GCTGCX3C CGGQGAGGOG ACAGCAGGGG AAGTCIACIC 3800 
IDQOXAQQC 'OOGICIDCm GCIGAAAGQC TOCTOGQGIG GTOCATIGCT 3850

TIGOQCnCG GQQCA03IOS IQQQCGIdT GOQQQCIQCr GIGIGCAGOC 3900 

G33333I03C GAAGGa3GIG GACTICAIIAC QOGTIGAGIC TAIQGAAACT 3950 

ADCAIGQQCT CTOCXSGICTr CACAGACAAC ICAACGOaQC OQCXTOIAQC 4000 

QCSOCATIC CAAGIQQCAC A3IID3CA03C TCCiaCIQQC AQ03QCAAGA 4050 

QOMTPAMJD QOaQaCIQQG TATOCAGCXX: AAGQGEACAA. a3IQ0IO3IC 4100 

CTCAAODQCSr CnSnOOOQC CAGCITTmS TnOOQCmr AOMlSroCAA 4150 

G3CACACQ3r ATDGADOCIA ACAICAGAAC TOGaGEAAQG AOCATEACCA 4200 

a3QQCX3QCIC CATUmCAC TOCACTTAIG QCAAGTICCr 1Q00GAO9Gr 4250 

GGCIUITCIG QOGOOBCrTA IGACATCAm ATAIGIGATC AGIQQCACIC 4300 

AACIGACIOG ACmCX2m:n: T3Q3CAia3G CACAGIOIEG GACXS^AQOGG 4350 

AGAOGQCIGG AG03CX9QCIC 0103100103 0CAC03CEAC ADC3XXX3QGA 4400 

TGGOmCXDG lOOCACACOC CAAIAIOGAG GAAA33^a3CX: lOIOCAACAA 4450 

1GGAGAGATC OXTIOIAIG GCAAAGCX2ff OCX^CATIGAG QQCATCAAQG 4500 

GGGGGAGGCA. TCIGATITIC TQCCMTCCA AGAAGAAAIO 'KS^OGAQOIC 4550 

GCQQCAAAGC lOACAQGCOT OQGACIGAAC GOIOIAGCAT ATTACOGOGG 4600 

CCnOATCIG TOOGICATAC CGCCIATQQG AGAG3IOSIT GIOGIQ3CAA 4650 

CAGAOGCICr AATCAOSGOr TICACrXSGOG ATTTIOACIO AGIGATO3AC 4700 

TCCAATACAT GIUICAOOGA GACAGIOGAC TICAQCTIOG ATCGCAOCIT 4750 

CACCATT3\G AQGAOGACQG TGOOQCAAGA 03GQGIGIDG OGCICGCAAC 4800 

QGOGAQ3EAG AACIOQCAGG GGIAGGAGIG QCATCIACAG GITIOIOACr 4850 

OIIAQGAGAAC GGCXrTCGGG CATCITOGAT TCTIOGGIOC 10IGIGAGIG 4900 

CTATGACGCG GGCIGIOCIT QGrATCAQCT CAOGQCCXSCr GAGAGCI03G 4950 

TTAQGnOCG QGOTTAOOrA AATACAGCAG QGITGCCQGr OIOOCAGGAC 5000 

CATCrGGAGT TCIGGGAGAG CGIOTIOACA GGQOICAQOC ACATAGATOC 5050 

C^CACnOCIG TOOCAGACIA AACAGGCAQG AGACAACTTT QCITACCIQG 5100 

TCGCATATCA AGCIACAGIG T30GCCAQQG CICAAGCIOC ADCTOCATCG 5150 

TOGGAQCAAA TGIOGAAGIG TCICATADQG CT3AAACCTA CACIQCAOQG 5200 

GCCAACAOX: CTOCIGTATA GGCTAGGAQC OGIOCAAAAT GAQGICATQC 5250 

TCACACACCC CATAACEAAA TACATCATOG CATGCATGIC QGCIGACCIG 5300 

GAGGIOGTCA CTAGCADCIG GGIGCIQGTA QOOGGAGIOC TIQCAGCTIT 5350 

Q3QCX3CATAC TCCXIPGAQGA CAGGCAGIGT GGICATIOIG GQCAQGAICA 5400 

TOnOICOGG GAAQQCAGCr GIDGnOODG ACAOGGAAGT CTTCIACCAG 5450 

GAGnaS^TC AGATGGAAJSV GIOIOCCICA CAAdTCCIT ACATOGAGCA 5500 

GQGAAT3CAG CTCQOOGAGC AATICAAGCA AAAQGCX3CIC GQUi'iUi'iGC 5550 

AAAGQGCCAC CAAGGAAQ03 GAGQCIOOIG CICQOGIGGT GGAGIOCAAG 5600 

TGGCXBAGOX TTOAGACCIT CIOGGOGAAG CACAIOIQGA ATTICATCAG 5650 

CX3GAATACAG TAOCTAGCAG GCTTATCCAC TCTGCX:TOGA AAOXXDGGGA 5700 
TAQCMCA3T GAilQQCATIT ACAQCi'iUm TCI^Cm30CC QCICAOCACC 5750

CAAAACAOCX: TCCIGITmA CATCnOQQG QG?m33GIQG CIGCXXAACT 5800 

QGCKCTDQC AGOQCIQCCT CAQCITIOGr GOGOQCOBGC ArOQaOQS^G 5850 

CXSGCIGITOG CAQCATftOQC CTIQQGAAGG TQCIOSIOGA. CATCTIGOaG 5900 

QQCIAIG3QG CftQQQGim: OSQOGCACIC GimXTTIA. AQC?ICAIGAG 5950 

aaOGGftGGIG aXnXX2^003 AQS^OCIQCT CAACTEACIC OCIQCXJOa: 6000 

TCICTCCIQ3 IGCXXTEGSIC 0103003103 TGIGOSCAQC AATACIQaST 6050 

QQQCAGGIQG QOaDQQGAGA GQOOOCTOIG ACa3QCIGAT 6100 

AQa3n03CT TOQOSOQSm ADCAOGICIC CXXTAOTftC TAIGIQOCIG 6150 

AGAQa3Aa3C TCGAQCAOSr GICACICAGA. TOriCICIAG <XTmX2^ 6200 

ACICAACIQC TCAAGOOQCr (XA0CAi3IQG ATiaMGAQG ACIGCICIAC 6250 

GCCATGCIO: GQCI03IGGC T?^AQQ3A3Gr TIQQGATIOG ATAIGCAC33G 6300 

TCTKS^CIGA. CriCAAS^^ TOQCIOGAGT OiZAAACTOCT GQOOOOOriA 6350 

CCX3QG?^Gia: CrrrcCIGTC AT3Xm03^ GQGT^CAAQG GAGICIGQQG 6400 

GGGGG?03GC A1CA.TGCAAA CC?O:nG0CX: ATSQGGAQGA. CAGATCGCQG 6450 

GACATCICAA AAACXSGTIOC A1GM3GAIDG TAQOGOTrAG AACXIDOCftQC 6500 

AACAOSIQGC ACX9GAACX5IT COGCATCAAC GCAIACACCA. aSOGACCITG 6550 

CACAGCCTOC ODOGClGCXrA ACTATjPGCAG GQCX3CIMQG OQQCTOGCIG 6600 

CIGAGGA3IA CX?IQGAQGTT A03Q3ICT3G GGGATITCCA CTAOOIGADG 6650 

QGCATGACriA CTGACAACCT AAAGTGCCCA TCQCAQGITC QQGCmrGA 6700 

AITUnCAOG GAGGTCGATC GAGIQQQGIT GCACAQCTAC GCTCQOGOGT 6750 

GCAAACCICr TCTACGQGAG GACGTCACCT TCCAGGIDOQG GCICAACCAA 6800 

TACrTGGTCG GC5IOGCAGCT 0CCAT3QGAG CCOGAAODGG ACXJCAACAGT 6850 

GCITACnCC ATGCICACCG ATCCCTCQGA CATIAGAGCA S^GA033CIA 6900 

ASCXJTAGOT' GQCIAGAQGG TUTOCCDCCr Cl'i'iAGQCAG CICA.TCAGCT 6950 

AGCCAGTIGr CTOCXSCmC TITGAAGGQG ACATGCACIA QCTACCATIGA 7000 

CrODOOGGAC QCIGACXIDCA TCGA3QCXAA GCICTIUIQG a3QCAGGAGA. 7050 

ia3GCX3GAAA CATCACTOGC GIGGAGICAG A3AAIAAQ0r AGEAATTCIG 7100 

GACICTTTCG AACCXSCTTCA CQCX33AQQQG GATCAC2ia3G AGAIEATCQGr 7150 

QGCX3GQGGAG ATCCIC3QGAA AATOiMSAA GTICmTICA QQGriGCCCA 7200 

TATCGQCAOS CXXX3GACTA.C AATGCICCAC TGCIAGACTC CIQGAAQGAC 7250 

COSGACIAOG TCCCICOGCT GGrA.CACX3G?^ TSQCXIAnQC CAQCTACCAA 7300 

GGCroCICCA ATACCADCIC CAQQGAGAAA GAQGADGGIT GTOCIGACAG 7350 

AATCCAATGT GIUTTCIOX TTQGQQGAGC TCQCXIACIAA GADCITOOCT 7400 

AGCIDQGGAT CX3ia3GQQC?r TCATAQa3GC Aa3GCX3ACOG CCCTTCCTGA 7450 

CCTSGCCroC GACmCXSGTC ACAAAQGATC CGA.CC?rTGAG TCGTACIOCr 7500 

CXATGCCCXr CXTIGAAQQG GAQCTXSQQQG ACCQOGAICT CAGCXSACQGG 7550 

TCrroGIUTA CXDGIGAGIGA GGAQGCTAGT GAQGAEUIOG TCIQCIGCTC 7600 
i^TCIOCTAT A03IQGACfiG GCGOXTCAT CMJ30C^W3C GCiaDQGAQG 7650

AAAGIAAQCr GOXAICAAC CXDSITGAQCA ACIUi'i'iUCT QOIECAQCAC 7700 

AACATOGICr ACXXXS^CAAC ATCXDOGGaOC QC3y!^3CrTCC QQC^ySAAGAA 7750 

aJECAOCriT GACAGATIO: AAGIOCIGGA. IGATCATIAC 03QGACX3IAC 7800 

TCAAQGASn' GaAQQC3S\AG Q03IDCACAG TEAAQGCmA QCTICEATCr 7850 

ATAGAQGAQG CXTIGCAAQCr GAOQQCXXX^^ CATTCQCSOCA AATOCAAATT 7900 

1GQCTAIGQG GCAAAQGAGG TaCX33AAGCT QCXDSmAQC 7950 

ACAIOQQCTC CX?IGIGQGAG GAL'l'iUL'lUG AAGACACIGA. AACAOGAATT 8000 

GACAOCAOCA TCAIQQCAAA AAGIGAGC3IT TIUIQCX3ICC ASOCAGAGAA 8050 

QQGAQGCXI3C AAQOCAGCTC GCTTIAnmr AnOOCaGAC C?IG9GAJ3nC 8100 

GIUIATOCGA. GAAGAIQGOC Ci'i'JJACGAOG T3C3ICIC3CAC QCITCCICAG 8150 

GCGGIGAT33 GCIDCICAIA a3GATnCAA. TACICCXXrA AGCAGOQCCT 8200 

QGAGTTCCIG GTGAATACCr GGAAMCAAA GAAftlGOOCT i^QOGCTICr 8250 

CATA.TGACAC CQGCIUi'i'i'i' GACICAAOaG OXACIGACaG TCACAITOST 8300 

GTHMGACT CAATTIAGCA ATCTIGIGAC TTOOGaacrx; AQGCCAGACA 8350 

GGOCATAA3G TCGCICACAG AGQQQCmA CNI033333T OCX3CTGACIA 8400 

ACICAAAAQG GCAGAACIGC QGrmTC3QC QCT3CXD9a3C: AAGIQQCX3IG 8450 

CTCACmCTA GCTGOGGTAA TAQCmCACA TCITACTim A33CCACIQC 8500 

AGCCIGI03A GCB3CAAA3C TCCAQGACIG CAQGAIGCIC GIGAA03GAG 8550 

ACXSACCrror OGITATCTGr GAAAGCX3CX3G GAACXTAQGA QGAIG0GG03 8600 

GCCCTAOSAG CXTITCACXSGA GGCTATGACT AGGIATIDGG QOOOOOOOGG 8650 

GGATCOBGCC CAAGCAGAAT ACGAXTOC^ GCIGAIAA2A TCATSTICCr 8700 

GCAATGIGIC AGID3QGCAC GATOCATCTC GCAAAAGoCT ATACIACCIC 8750 

ACCXDCTGAa: GCAGCACCCC CCmGCACXSG GCIGQGIQGG AGACAGCTAG 8800 

ACACACICCA ATCAACICrr GGCIAGGCAA TATCATCATC TATOOGGCrA 8850 

GCCTATCGGC AAGGATCATT CIGATGACIC ACi'i'i'i'iCIC CAICCnCIA 8900 

GCICAAGAGC AACmS^AAA AGCCCIGGAT TCTCAGAICT AGGGQGCriG 8950 

CTACIGCATT QAGCCACTTG AQCIADCICA GATCATTGAA OGACTOCATG 9000 

GTCITAQCGC A1TIACACIC CACAGITACT CTGCAQGIGA GATCAATAQG 9050 

GTOGCnCAT GQCICAGGAA ACriGQQGIA CCAQCCTIGC GAACCIQGAG 9100 

ACAT03GGQC AGAAGIUICC GGQCIAAGCr ACIGICQCAG QQQQGGAGGG 9150 

ODGOCACnG TGGCAGATAC CldTIAACr GGGCAGIAAG GADCAAGCTT 9200 

AAACICACIC CAATGCQGGC (DGGGIOXAG CIGGACnUT CIQGCIQGIT 9250 

aJDGGCIGGT TACAGQ3QGG GAGACATATA TCACAGOCIG lUICGIQCCC 9300 

GACXDCCGCIG GTTTCQGriG IGOCIACia: TACTTIUIGT AGGGGTAGQC 9350 

ATTTACCIGC TCOQCAAQOG ATGAAGQGGG AGCTAACCAC TCCAGGCCTT 9400 

AAGCCATnC CIGTITTTTT 'i'i'i'1'l'l'l'i'iT 'I'i'i'i'i'i'i'i'i'i' lUi'i'i'i'il'i'i' 9450 

Ti'lL'i'i'iCCT TTOCrrcnT TTTTGCITTC 'I'i'i'i'iGOCIT CnTAATQGT 9500 
QQCIOCMXJT TKSOCCmsr CACXSQCmOC TGIGAAAQGfT 003IGAGCXDG 9550
CATCACIGCA GAGAGIQCIG ATACIQGCCT CICIGCftGAT GATCfT 9595 
MSTNEKPQRK TKRNI1SIE?RPQ D^/KFPQQQQI VOGVYLZiEE^ GERDSVRAIR 50

KASEE^EFJG REQPIEFM?R FECS^imQFG YM=!LY3NEG IJ3WAGWLI5P 100 

RSSE^PSWGFT DEE?ERSE?NLJ3 KVTDILTCGF AEOIMELV GAFljQGftM^ 150 

lAHSVRVLED GVNYAIGNLP QCSESIFLLA LLSCLTTPAS 200 

YHVIMXSNS SIVYEAAEVI MHTPQCVPCV QBGNSSRCW iUJTPIIAARN 250 

ASVPTTITRR HVDLLVGrftA PCSAMYVGDL OSSIFL^L FIESPE^RHET 300 

VQDC3^SIYP GHV9C3iRMaW ^yIMMNK^ISPIT ALWSQLLRI PQAWCM^ 350 

AHWGVLAGLA YYSMVOSKMAK VLIVAIIFAG VDGEIHITGR VAOHTSGFT 400 

SLFS9GASQK IQLVNUSGSW HINRTMICN DSU?[GFT3^ LFYAHKFNSS 450 

QCFERMASCR PIDWEAQGWG PPIYIKFNSS DQRPYCWH!£A PE?P03WPAS 500 

Q^QCSVYCFT PSPWVui'iU RSGVPIYSSAJG EISEm/MLSJSI NIRPPQGNS^ 550 

GCIVMS3SIGF TECroOGPPCN IGGVONIE^rLI CPiUCb'KKHP EAIYTE^DSSG 600 

Em/TEFOJVD YPYKLMiYPC TLlvlFSIEKyR MYVQGVEHI?L NAACWTRC^R 650 

RCM£I3RERS ELSPUSLSTT EWQILPCAFT TLPALSIGLI HLfOJIVD^ 700 

YLYGVGSAFV SFAIKWEYIL IjLFLIlAEftR VCACEMMX lAQAEAALEN 750 

LWLNAASVA C^^HSKLV FPCAAWYTKG RIAEGAAYAF YGVWPLLLiLL 800 

lALPPRAYAL DREMAASCX3G AVLVGLVFLT LSPYYKVFLT RLIWWLX^YFI 850 

'TOAEAHMQVW VPPLNVRGGR DAIIIIJICAV HPELIFDITK LLLAZDSPLM 900 

VLQAGriM/P YFVRAQC3LIR ACMJVRKVAG GHYVJ^^IVFMK LGALIGIYVY 950 

NHLTPLRim HAGmDLAVA VEPWFSAME TE<VnWGADr AAGC3DIin3L 1000 

EVSAREO<EI FTjGPADSLEG QO^IRLIAPIT J^SQ^TPO^ QCHTSLTai 1050 

WSTATQSEL ATCHSGVCWr WHGAGSE<TL AGPKGPIT^ 1100 

YINVDIDLVG W^APPGARSM TPCSOGSSDL YLVIRHAEVI PVRRK2)SRG 1150 

SLiLSEE^SY LKGSSQGEUL CPSOiWGVF RAAVCIFGVA KAVDFIPVES 1200 

MEITM^RTF imSTPPAVP QTFQJmLHA ProSC3<SIKV PAAYAAQGYK 1250 

VLVL2vff>SVAA TLJ3FX5AYMSK AHGIDFNIRT CVKTmOGS ITYSIYGKFL 1300 

ADGQCSGGAY DIIICDBCHS TIDSITILGIG TVLDQAEIAG ARLAA7LATAT 1350 

PPGSVTVPHP NIEEIGLSISN GEIPFYa<AI PIEACKQGRH LIPCHSKKKC 1400 

DELAAKLTGL GLNAVAYYERG m/SVIPPIG rVWVATDAL MIGP3X3DFDS 1450 

VIDOTTCVIQ IVDFSIDPIF TIEITIVPQD AVSRSQRK21 TGRC3^IYR 1500 

FVTPGERPSG MFTDSSVLCEE YDAGCMa/YEL TPAEISVRLR AYHsTTPGLEV 1550 

CQmUEFWES VFTCLIHIDA HFLSQIKQAG nSIKPYLVAYQ ATVCAFAQAP 1600 

PPSWDgyiAKC LIRLKPnm FTPLLYRLjGA VgSEVILTHP riKYIMACMS 1650 

ADLEWrSIW VLVGGVLAAL AAYCLTIGSV VIVGRIILSG KPAWPEKEV 1700 

LYQEFDEMEE CASQLPYIEQ O^IABQFKQ KADGLLQTAT KQAEAAAEW 1750 

ESKWRALEIF WAKHTWIFIS GIQYLAGLST LPOIPAIASL MAFTASITSP 1800 

LTT^snXiiFN ID3GWVAAQL APPSAASAFV GAGIAGAAVG SIGLGEWLVD 1850 

ILAGYGAGVA GALVAEE<X/MS GEVPSIEDLV NLiPAILSPG ALWGWCAA 1900 
Hi^RfMSPGE Gm^mmJi AEASFOQHVS PIHWPESm AARVIQILSS 1950

LTTIQLLKRL HQWINE3XSr PCSGSWLREV W0WICIVL1D FKIWL^^SKLL 2000 

PRLPGVPFLS CQEGYK3VWR GDGII^JETCP OS^QIAGHVK N3SMRIVGER 2050 

TCSNIWHOT Png^OTIGPC TPSPAH^SR AIJWF[VAAEEY VEVIFWOTH 2100 

YSmMTHMJ KCPCQVPAPE FFIEVDGVFOli HRYAPACKEL LRECVTFtQVG 2150 

L2S3QYLVGSQL PCEFEPCVIV LTSMLIDPSH n?^EIAKRFiL ARSSPPSLAS 2200 

SSASQLSAPS Ii<ATCnHEiD SHDAnr.TF?^ LUWRQEtCaST ITRVESE3SIKV 2250 

VILDSFEFIH MGDEFELS^ AAKHFiKSRK FPSALPIWAR EmNPELLiES 2300 

WKDEDWPiV VB3CPLPPIK APPIPPEKRK RIWLTESNV' SSMAELAIK 2350 

TFGSSGSSAV DSCJCATALPD lASUXXKSS EWESYSSyiPP LEEEPGDPDL 2400 

SDGSWSIVSE EASECWCCS MSYTWIGALI TPCAAKRSKL PINELSNSLL 2450 

FHHNMVYATT SRSASLRQKK VTBTBU^JLD rmREVLKEM KAKASIVKAK 2500 

liLS TK FACKL TPEHSAKSKF GYGAKEVFNL SSRAVNEURS VWEDTT.EnTE 2550 

TPlUi'i'lMAK SEVPCVQPEK CGBKPPSKLJV FPDLGs/EVCE KMALYEWST 2600

LPQAVECSSY GPQYSPKQEV EFLWIWE^ KCHGFSYDT PCFDSIVTES 2650 

DIEVEESIYQ CCDLAEEARQ AIRSLTERLY IQGEUINSKG ^J^DGYFRCRA 2700 

SGVLTTSCGN TLTCYIi^TA ACRAAKD2DC IMLVNXDLV VICESAGIQE 2750 

DAAALRAFIE AMIRYSAPPG DPI^PEYDLE LITSCSSNVS VAHDASCKRV 2800 

YYLTRDFITP IARAMaETAR HTPINSWDGN IIMCCTm RMIIMIHFFS 2850 

HIAQBQLEK ALDCQIYGAC YSIEPLDLPQ IIERIH5LSA FILHSYSPGE 2900 

HSIRVASCLRK D3VPPLKIWR HRARSVRAKL LSQQGRAATC GEmjmmJR 2950 

TKIi<LTPIPA ASQLDLSGWF VAGYSGGDIY HSLSRARPRW FPLCLLLLSV 3000 

GVGIYIiiFNR 3010 
#2. Strategy for constructing chimeric clone of HCV (pH77CV-J4)
which contains the nonstructural region of strain H77 and the
structural region of strain HC-J4



[Figure: schematic of the chimeric genome (structural region C+E1+E2+p7; nonstructural region NS2 + NS3 + NS4B + NS5A + NS5B) showing PCR products A, B, C and D, the fusion PCR of fragments B and C, and plasmid pCV-J4L6S. Restriction sites marked: Age I (156), Cla I (710), Xho I (2282), Nde I* (2763), Eco 47 III (blunt end; 2851), Hind III (7862), Afl II (9403); a further site is marked at 9160. Primers labeled: H2751S (Cla I/Nde I), H2870R, H7851S, H9140S (P-M), H9173R (P-M), H9417R, J4-2776R (Nde I).]



1. Fragments A, B, C and D: PCR amplification from pCV-H77C or pCV-J4L6S

• Fragment A: additional Cla I site, artificial Nde I site introduced by a single mutation
(C→T at nt 2765 of H77C), and authentic Eco 47 III site

• Fragments B and C: Nde I site eliminated by a single mutation within the primers
(C→T at nt 9158 of H77C), followed by fusion PCR of both fragments

• Fragment D: artificial Nde I site introduced by 2 point mutations within the primer
(T→A at nt 2752 and C→T at nt 2765 of J4L6S)

2. TA cloning of PCR products

3. Sequence analysis

4. Cloning of Fragment A (Cla I-Eco 47 III) and Fragment B/C (Hind III-Afl II) with correct
sequence into pCV-H77C

5. Complete sequence analysis of the new cassette vector [pH77CV], into which the structural
regions of different genotypes can be inserted

6. Cloning of Fragment Age I-Xho I (cut out from pCV-J4L6S) and Fragment D (Xho I-Nde I)
with correct sequence into the new cassette vector; 3-piece ligation

7. Complete sequence analysis of 1a+1b chimera [pH77CV-J4]

8. In vitro transcription (within 24 hours of inoculation)

9. Percutaneous intrahepatic transfection into chimpanzee

FIG. 15 
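
The site engineering in the strategy above can be illustrated with a minimal Python sketch. It is not part of the original figure. The toy sequences are hypothetical and are not taken from pCV-H77C or pCV-J4L6S, and the helper names (find_sites, point_mutate, splice_at_shared_site) are invented for illustration; only the enzyme recognition sequences are the standard ones. The sketch shows a single C→T substitution creating an Nde I site (CATATG), a C→T substitution destroying one, and two sequences being joined at a shared site, loosely analogous to combining the structural and nonstructural regions at Nde I.

```python
# Standard recognition sequences for the enzymes named in the strategy.
SITES = {
    "AgeI": "ACCGGT",
    "ClaI": "ATCGAT",
    "XhoI": "CTCGAG",
    "NdeI": "CATATG",
    "Eco47III": "AGCGCT",
    "HindIII": "AAGCTT",
    "AflII": "CTTAAG",
}


def find_sites(seq, enzyme):
    """Return the 1-based start position of every occurrence of the enzyme's site."""
    site = SITES[enzyme]
    seq = seq.upper()
    return [
        i + 1
        for i in range(len(seq) - len(site) + 1)
        if seq[i:i + len(site)] == site
    ]


def point_mutate(seq, pos, ref, alt):
    """Substitute ref -> alt at 1-based position pos, checking the reference base first."""
    found = seq[pos - 1].upper()
    if found != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {found}")
    return seq[:pos - 1] + alt + seq[pos:]


def splice_at_shared_site(left, right, enzyme):
    """Join the part of `left` before the site to the part of `right` from the site onward.

    Simplification (assumption): the exact cut position inside the site is ignored,
    which does not change the joined sequence when both inputs carry the same site.
    """
    site = SITES[enzyme]
    i = left.upper().find(site)
    j = right.upper().find(site)
    if i < 0 or j < 0:
        raise ValueError(f"{enzyme} site missing from one of the sequences")
    return left[:i] + right[j:]


# Hypothetical toy sequences, for illustration only.
toy_without_site = "GGCATACGCACTGG"   # contains CATACG, one base away from an Nde I site
toy_with_site = "AACATATGAA"          # contains one Nde I site starting at position 3

created = point_mutate(toy_without_site, 7, "C", "T")   # C->T creates CATATG
destroyed = point_mutate(toy_with_site, 3, "C", "T")    # C->T turns CATATG into TATATG
chimera = splice_at_shared_site("AAAACATATGCCCC", "GGGGCATATGTTTT", "NdeI")

print(find_sites(toy_without_site, "NdeI"), find_sites(created, "NdeI"))
print(find_sites(toy_with_site, "NdeI"), find_sites(destroyed, "NdeI"))
print(chimera)
```

The reference-base check in point_mutate is a design choice: it catches an off-by-one position before the substitution silently changes the wrong base.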



pH77CV-J4 Sequence 



GCCAGOGOO: 1GAIQQQQQC GACACICCAC CATGAATCAC TOaOCIGTGA 50 

QGAACTACIG lUTICADQCA GAAAGOGICT AGGCAIQQCXS TISGEAIGAG 100 

TGKxsiaa^ ccj£cpa^ 150

OC3GAAa333r GAGTACACOG GAATIQOCAG GAOGAOCQQG T Ol'l'l'lLTiG 200 

GAICAAOCXG CICAA^GQCT GGAGATnOG QQGIQQGaOC GOGAGACIGC 250 

TAGOOGAGIA GiUi'iUQGIC GCGAAAOXX: TIGIGGTACT QOCTGAnaOG 300 

GIGCnOOGA GIGOOOOQGG AQGiCiULJlA GAOOGIQCAC CATCfiOCS^ 350 

AATCCIAAAC CTCAAAGAAA AAOCAAACGT AACAO^iAa: QQCGOOCftCA 400 

GGA03TCAAG TIOOOGOGOG GIGGTCAGag C UI'IUJIUG A GnTAOCIGT 450 

T3Ca303CAG GOGOOQCAGG TIQQGTCIGC QOGOGACIAG GAAGQCnDC 500 

GAGaaGTOGC AAOCTCUTOG AAGQOSACAA CXTTAIOaCAA AGQCTQaaOG 550 

ACOOGAQQGC AGGQOCTGOG CICAGOCTQG GTACOCnOG CX^OCTCTAIG 600 

GCAAIGAQ3G CCIGGGGIGG GCAQGAIGQC TCCIGICACC OOGOOQCIO: 650 

CX3GGCTAGIT GGQQODCXIiAC GGACOOCrOG aSIAGGICQC GTAACnOOG 700 

TAAQGICAIC GATAODCrrA CATOCX3GCIT OQOGGATCIC AIQGGGTACA 750 

TIDOGCIQGT OSGCQQCOOC CIAQQQQQGG CIGQCAQOGC CITOQCACAC 800 

GGIUICCX3QG TIUIGGAGGA CGGOGIGAM: IAIGCAACAG GGAACITOCX: 850 

QQGTTQCICr TrCTCTAICT TOCICITGQC TCIGCIGIOC 'iUi'i'iGAOZA 900 

TGOCAGCITC aSCTEATCAA GTOOSCAADG TUTOCXSOGAT ATADCAIGIC 950 

ACGAAOGACr GCTOCAACTC AAQCATIGIG TAIGAOSCAG 03GAGGIGAT 1000 

CAlGCATACr OCOGQGIGQG TGQCCIGTOT TCAQGAQGGr AACAGCTCOC 1050 

GnOCIGGGT AGCQCICACr CCCAOGCTOG OSQCCAQGAA TQOCAQOGIC 1100 

CCCACIAiDGA CAAIAD3ADG CCADGTEDGAC TiUL'iUGi'iG GGAOSGCTQC 1150 

TnCIGCICC GCTAIGIAOG TGGQOGATCr CIGOGGAICT ATirTOCIOG 1200 

TCTOOCAGCr GTIGACCriC TOGCCTOQCC GOOSOGAGAC AGIGCAGGAC 1250 

TGCAACIGCr CAAICTAIDC CQQCCAIGm TCSGGICACC GCAIGQCITG 1300 

GGAIAIGAIG AIGAACIOGT CACCTACAAC ^GOOCTAGIG GIGIXJGCAGT 1350 

TGCIOQQGAT OCCACAAQCT GICGTGGACA TQGTEGQOGQG QQCCrACIQG 1400 

GGAGIXCTGG a3QQaCriGC ClPCmrVCC A3?33rAa3GA ACroOGCTAA 1450 

GGnCIGATT GIGGOGCTAC TCTTIGCDGG UGi'iGAOGQG GAGACCXACA 1500 

CGAQSGOGAG QGIGaCOQaC CACACCADCT CCXSGGnCAC GIDOJi'i'i'iU 1550 

TCATCIQQ3G CGICICAGAA AATCCAGCIT GIGAAEAOCA ACGQCAOirrG 1600 

GCACAICAAC AQGACIGQOC TAAATIQCAA TGACIDQCIC CAAACIGOGT 1650 

TCmOOGGC GCrcrnTAC GCACACAAGT TCAACTOSIC QQQGIQQCOG 1700 

GAGOGCATQG QCAGCIGCOG OCCCATIGAC TOGTiaSQCC ^333310003 1750 

QOXAICACC TATACTAAGC CTAACAGCIC QGAICAGAGG OCnAITGCr 1800 

QQCATEACX3C GOCIGGAQOG IGIQOIGIDG TAOCXDQQGTC QCAQGIGIGT 1850 

QGIQCAGIGT Ai'iUi'i'iCAC CQCAAQCOCT GTIGIGGIGG Q3^£X3€CCA 1900 

TOGnOOOGT GIQCCIADGr ATAQCiaOQG QGAGAMX3AG ACAGADGTEGA. 1950 

TCCIOCTCAA. CAACAOOOGT COGOCACAAG QCAACIQGIT a3QCIGEACA 2000 

TOGATCAMA GTACiaOGn CACIAAGADG T3a3GAaJCC CXDaCGICTAA 2050 

CA.10G3GGQG GIOaSTAAOC QCACCnGAT CIODOQCACG GACiUL'i'iUC 2100 

GGAAGCAOX: OGAGQCIACr TACACAAAAT GIQQCIOGGG QQQL'iUUi'iG 2150 

ACAOCTAQCT QCCEAGnZAGV CEAGGCAIAC AQOLUTiUGC ACIACCXDCIG 2200 

CACICICAAT TTTIOCAICr TEAAQGTCTAG GA3GEAIGIG GOQaOOGIGG 2250 

AGCACAQGCr CAAIQCXX9CA TQCAAnOGA CIOC^GGAGA GOSCIGEAAC 2300 

TIQGAQGACA GGGAIAGGfIC AGAACICSGC CXX3CIGCIGC TCICEACAAC 2350 

AGACT33GAG AIACIQCacr (JiULTi'iUAC CAOCCTAODS GCnTAiaCA 2400 

CTOJi'i'iGAT CCA.TCICCAT CAGAACA30G 'IQG?03TCCA. iVE?OnGTAC 2450 

GGflCTAQOGT CAQCXJCTIGr CIOL'i'i'iUCA AICAAAOmS AGrACATOCT 2500 

GTIGCmTC CITCTCCIQG CAGAD30G0G GGICTGFIGCC TGCi'iUiQGA 2550 

TGATCCIGCT GATAGCCCAG GCIGAQGQ03 OCTEAGAGAA CrroGfrOGTC 2600 

CICAAIGOGG OGTOCGIQGC OQGAGOGCAT GJmnCVCT GCTTICriGr 2650 

GITUriUIQC GGXDGGCIQGrr ACATEAAGGG CAGGCIGGCT aCIQQGQOGG 2700 

CX3rE7^IGCITr TTAIGGCXJCA. TGGCOGCIQC TCCIQCTCCT ACIGQCCTIA 2750 

CCI^£CPn3^ CATAIGCACT GGACAQ3GAG GTGGCXDGOCT CX^IGIQGCJQG 2800 

CXii'iUi'iL'iT GTQQQSmA TQQOGCIGAC TCTGIOGCJCA TATIACAAGC 2850 

GCn5.TAICAG CIGGIQCATC TGGIQGCITC ACTALL'i'i'iCT GACCAGAGIA. 2900 

GAAG03CAAC TCCACXHGIG GGnQOOCXDC CICAAQGTCC GQ33993G0G 2950 

CGAIGCaHC AICTTACICA TXHUIGTAGT ACACCOGADC CIG3IATITG 3000 

ACATCACCAA ACTACTCCIG GCXimUi'iUi GACQOCITIG GATTdTCAA 3050 

QCCAUi'i'iGC TTAAAGICXX: CIAUi'iLUiU CXSOSTICAAG aDCITCTOOG 3100 

GAICIGOGCG CIAGQQOQGA AGAI?^3QQGG N33ICNrm2 GIQCAAATOG 3150 

OCATCAICAA GnAGGOGOG CTTACIGGCA. QCimUlUiA TAACCATCIC 3200 

ACGCCICnC GAGACIGGQC GCACAAa3GC CIQOGAGATC TQQ003IQGC 3250 

TGTQGAACCA. GiaJiUi'iL'i' COOGAAIQGA GAOCAAGCTC AICACXIP3QG 3300 

GGQCAGAIAC 0GCCX30GIQC GGIGACAICA TCAAaaOCIT QOOOGIUICT 3350 

QOQOCTAGQG QGCAGGAGAT ACTGCTTOGG QCAODOGADG GAAIGGICIC 3400 

CAAQGGGTOG AGGITGCIGG CGOXATCAC GQOGrnAOQCX: CAGCAGAOGA 3450 

GAQGCXnOCr AQQGIGrAIA. ATCACCAGQC TGACTGQOGG QGACAAAAAC 3500 

CAAGIGGAQG GTGAQGIDCA GAIOJDGICA. ACIQCTACQC AAACCTTOCT 3550 

GGCAAD3TCC ATCAATOQQG TATQCIGGAC TUICIACCAC GQ3GCX33GAA 3600 



FIG. 16B




OGMGACCAT OGCATCAOCC AAQGSIOCIG TCATCCAGAT GIAI?£X::AAT 3650 

GTGGAQCAAG ACCi'iUiGQG CIGQOGOGCT OCICA^^GCTT CX^DQCICATT 3700 

GACACQCIGT A0CIQ03QCr CCIOSGACCr 'mOCTOCTC AOGMQCACG 3750 

CXX3ATGICAT TCajJD303C CQQGG^iGGIG ATPGO^GOOG TAGOCIQCIT 3800 

TCQGCXXOOC OCa U'l ' lOJlA CITGAAPOQC TCCroQGQQG GflOaQC'iUi'i' 3850 

GiaCXXXDOOG QGACMQCaG TQQQaCEATT GAQGOaaaOG GIGIQC2£)CC 3900 

GIQGAGIOSC TAAAGDQGIG GACITIATOC: CIGTGGAGAA CXTEAQOGACA 3950 

j^CCATCAGAT 0CXXO3IGIT CAa3GACAAC TCCTCTOCAC CAGCAGIQOC 4000 

CCAGAGCriC CAQGIGOaOC ACCIQCAIQC ICCCACOOQC AQOOGIAAGA. 4050 

Q0Nx:m:3:ir coaoQciaDG TAOOCsmi: ^mxTACAA qgigitiqgig 4100 

CrCAACCXCT CTCTIQCnQC AACXSCTOGQC TnGSIQCIT ACAIGIOCAA 4150 

QGCGCATOQG GITGAIOIJIA ATATCAGGAC OQQGGIGAGA. ACAATTAOIA 4200 

CIQQCAQCOC CATCAGGTAC TCCACCIA03 GCAAGnOCT TOOOGAOSGC 4250 

QQGIGCICAG GAQGIGCrm T3ACATAA3A ATITGIGADG ACTGQCACIC 4300 

GACX3GA3I3GC ACMGCmcr TGQQGATOQG CACTSIOITr GAQCA;^i3CAG 4350 

AGACIQCQQG GGOGAGACIG GTICTGCIOG CCNJTQOmZ aCTG03Q3C 4400 

T3DGICACIG ICTGCXima: TAACA3?0G?C G^^GGTIQCTC TGIOCADCAC 4450 

QOGAGAGAIC CXCITTEACG GGAAGGCTAT COOOOIOGf^ GIGATCAAGG 4500 

QGGGAAGAC?^ TCICAICTIC TOCCNJTCPA i^GAAGAAGIG OGAOGAGCIC 4550 

GCGGOGAAQC IGGTCQGATT GQGCAICAAT GQCGIOGOCT ACIACGG03G 4600 

TCTTCADGIG TCIGICATCC CGA(XAQ03G QGAIGTIGIC GTGGIGIDGA 4650 

CCGAIQCICr CAIGACIGGC TTTACG33QG ACITGGACTC TCIGAIAGAC 4700 

TGCAACADGn? GIGICACTCA. GAC?^GI03AT TICAGGCTIG ACCCTACCIT 4750 

TA.CCATIGAG ACAACCAOGC TCGCCCAGGA TQCIGICIDC AGGACIGAAC 4800 

GCa3QGQCAG GACIGQCAGG GQGAAGCXiS^ GGAICEAIAG Ai'i'iUiGGCA 4850 

CCGGQ3GAGC GCCXXTCOGG CATCITCGPC TO3IDCXJDGC TCIGIGAGIG 4900 

CEAIGACX303 QGCIGIGCTT GGrAIG?07r CAO3CXDCX90C GAGACI3^CAG 4950 

TIAGQCEACG AGCGTACAIG AACAOOGOGG GGCnODOGr GIGCCZ^GGAC 5000 

CAICriGAAT TnQGG?033 a3IUITE?OG QGGCICACIC ATAmSAIGC 5050 

CCACTTTITA TCCCAGACAA AQa^GAGIOG QGAGAACTIT GC™CCIQG 5100 

TAGGGIAiXA AGGCACGGIG TQOGCIJm^ CICAAGOGCi: TOGQOCATOG 5150 

TGQGACCAGA. TGIQGAAGIG TIT3Vim3C CTTAAAOrA OQCIOCAIGG 5200 

GCCAACAOCC CIGCTATACA GACIGGGOGC IGITC^GAAT GAAGICAQQC 5250 

IX^PaX^-aJZ AATCACCAAA TACAICATCA CATOCATCIC QG003ACCIG 5300 

GAGGIOGICA OGAQCACCIG QGlGCiUGi'i' G30G3GGIOC TOGCIGCICT 5350 

GGCaaCGTAT T3CCTGICAA CAGQCTQCGT GGTCAISCTG GGCAGGAIOG 5400 

FIG. 16C



pH77CV-J4 Sequence 



TCTIGiaDGG GAAQCCX3QCA. ATEAIACCIG ACAGGGAGST TCCCEADCAG 5450 

GAGTIGGAIG AGAIQGAAGA GIQCICICAG CACrEACOGr ACATCGAQCA 5500 

AGGGAIGAIG CTOGCIGAGC AGirCAAGOV GAAGGCXXTTC OQCCICC'iGC 5550 

AGADOGGGIC CGQGCAIGCA GAGGITATCA CXXXIQCIGT GCAGACCAAC 5600 

TOGCAGAAAC TI]GAG3ICrr TIQGQ03AAG CACATOIQGA ATITCATCAG 5650 

TGGGAIACAA TACnQGOGG QOCIGICAAC GCIGOCIGGT AACOOGQCrA 5700 

TIGCITCATT GAIQGCmr ACAQCIGGOG TCADCAGCCC ACTAAGCACT 5750 

GGCTAAACaC TOJiCi'iCAA CATAnGGQG 0303X303100 CIQOaCAQCT 5800 

ooGOoaoaQC OGiQaaaciA cioocrriGr oooiociooc ciaqciogog 5850 

aDQOCATCQG CAQCXSTIOGA CIQQQGAAQG TOCIDOIGGA CAl'iCi'iUCA. 5900 

GGGTAIGOOG GQQQOGIQGC QG3?^GCTCIT OmSCATICA AGATCAOma 5950 

OGGIGAGGIC COCBXAOGG AGGAOCIQGT CAAOTCTOCIG OOOQOGAIQC 6000 

TCTCQCCIOG AGOCXJi'iUiA GTOGGIGIQG TCIOCGCAGC AATACIQOGC 6050 

QOaCAOGnG GCCQOOaOGA GGGGGO^GIG CAAIG3ATCA AOOGGCmAT 6100 

AGCXnCGOC TQOGQQGQGA ACCALLUi'i'iC COCCAOGCJC TAiDGIGCGGG 6150 

AGAGQGAIGC AGCQGCODoC GIGACIGOCA TACICAQCAG GCICACIGIA 6200 

ACCCAGCroC TGAQGCGACr GCAICAGIGG ATAAQCTOGG AGTGIACCAC 6250 

TCCAIQCICC QGITCCIOGC TAAGQGACAT CIQGGACIQG ATAIQOGAGG 6300 

TCCIGAGQGA dTIAAGACC TGGCIGAAAG OCAAGCICAT QQCACAACTG 6350 

QCIGQGATIC ariTIGTGIC CIQCCAGQQC GGGTATAQOG GGGICIGGOG 6400 

AGGAGACGQC ATEAIGCACA CIOSCIGCXiA CIOIOS^GCT GAGATCACIG 6450 

GACATCICAA AAAOSQGACG ATGAGGATCG TGGGIOCIAG GACCIQCAQ3 6500 

AACAIGIGGA GIGGGAOGIT CCCCNTm^Z GOCIACACCA CQOGaQCCIG 6550 

TACICCXXTr CCIGOQCCXSA ACmTA^CTT CDGOGCIOIGG AGQGIGICIG 6600 

CAGAQGAAIA GGIQGAGAIA AGGOGGGIGG GGGACTIGCA CEAOGilAliUG 6650 

GGIAIGACTA CIGACAATCT TAAAIGCOOG TQQCAGAIOC CATCGQCGGA 6700 

AnrnCACA GAATIQGAOG GGGIQOGOCT ACAC^iGGnT GOQCXXXXTT 6750 

GCAAGCCCTT GCIGGQGGAG GAQOIATCAT TCAGAGIAGG ACIDCAGGAG 6800 

TACCQGGIQG GGTOGCAATT ACCnOOGSG QOGGAACCQG AOGEAQOOGT 6850 

GITGAOOTOC AIGCICACIG ATCCCIGOGA TATAAC^^GCA G?^G30aG003 6900 

QGAGAAGGIT GQQGAGAQQG TCACOOGCIT CI!ArQ3CC?fi CI0CI03GCT 6950 

AGCCAGCIOr O03CIOCA.1C TCICAAGGCA ACTTOCACOG OIAAOCAIGA 7000 

CIGCOCIGAC QCOGAQCICA TAGAQQCriAA CCIDCIOIGG AGGO^OGftGA 7050 

TQGGOGQCAA CATCACCAQG GITGAGICAG AiSAACAAJ^ GGIGATICIG 7100 

GACIGCnOG ATGQGCTTGT GGCAGAGG?^ GAIGAGQQGG AGGIUIGOGT 7150 

ACCIGCAGAA ATTUIG03GA AGICIDGG^ ATTCGOOCDG GOQCIGCXDCXS 7200 



FIG. 16D



pH77CV-J4 Sequence 



TCIQC3GCX3aG GCOQGACIAC AACCXXXD03C TAGEAGAGAC GIQGAAAAAG 7250 

QCTGACIAOG AADCACCIGT QC3IOCAIQ3C 1QCm3CIAC CADCTOCAOG 7300 

GiooCTGcr Giaxnmx: ctoqgaaaaa QaG?iAa3GfiG GicxiccAcaG 7350 

AAlCSACCCr J^aCTACIQCC TTG3CXDGAGC TIQOCAGCAA AAUi'i'i'iUaC 7400 

AGCICCICSA CnOOGQCftT TS£X3QQ0GAC CATCClCiGA. 7450 

QOCXDGCCCCr TCIQQCIQOC CCXXJOGACIC OGftOGfTIGAG TXXnm'iLTi' 7500 

OCAIQOaCJCC GCIQ3AGQ3G Q^GOCiaOGG ATOOQGATCT CAQ0GA0333 7550 

TCAIGSTOGA. OaGICAGIAG TQQQQaCXSAC AGQGA^OmS TCGIGIQCTG 7600 

CICAAIGICr TATI0CT3GA CAQOOaCACr Ca^ICADOOCG TOQQCIQOaG 7650 

AAGAACAAAA ACIGOOCATC AAGQCACIGA. QCAACICGIT QCEACQOCAT 7700 

CACAATCIQG I GIA T IOCAC CftCTICACX3C i^iUL'i'iUCC AAAQQCAGAA 7750 

GAAAGICACA TTIGACAGAC TGCAAGTIUT GGACAQOCAT TT^CCAQGAOG 7800 

TGCICAAQGA QGICAAAGOV GCQGOGICAA AAGTGAAQQC TAAL'i'iUCiA 7850 

TCOGflAGAQG AAQCTIGCAG CCIGACQOOC OCACATICAG CCAAftlCCAA. 7900 

GrnOQCCAT QQQQCAAAAG ACGIOCTIG OCAIQGCAGA A^^GGOGGTEAG 7950 

CCCP£ATC^^ CVCCGIGIGG AAAG?iCCnC TQGA^^G?£S^ TGTAACACCA 8000 

ATAGACACm CCAICAIGQC CAAGAAOG^ Gi'i'i'iL'iUOG TICAQCXriGA 8050 

GAAGQQ33GT QGIAAQCXZAG CTOJECICAT OJiUi'iOQCX: GACXnQ33CX3 8100 

T3CX30GTCIG OGAGAAGMG GGCCIGTAQG AOJiUJi'llAG CAAGCICOQC 8150 

CIQQODSIG?^ T3G3AAQCIC CnmSATIC CAAIACICAC CAGGACAGOG 8200 

GGTTGAATIt: CTQGIGCAAG CGIGGAAGIC CAAGAAGACC CGGAIGQGGT 8250 

TCTGGIAIGA TACCXGCICT TITCACIOCA CAGICACIGA GAGOGACAIC 8300 

CGTCACQGAQG AGGCAATTm CCAAIUi'iUi' GACCIGGACC OGCAAGGQOG 8350 

GGnDGGCCATC AAGTCCCICA CIGAGAGGCT TTAIGTIGGG GQCCL'iUi'iA 8400 

CCAATTCAAG QGGGGAAAAC TGCX3GCISOC GCAGGIQOOG CQGG?OaQ3C 8450 

GTACIGACAA CTAGCIUIQG TAACACQCIC iOTIQCrACA TCAAG3QC03 8500 

G3CAQCCIGr OGAQCDGCZ^ QGCIOCAGGA. CIQCACCATC CIOCIEGIGIG 8550 
GCGAQGACrr AGIOGITATC TCIGAAAGIG OG33Q3130CA. GGSQGJOGOG 8600 
GGGAGCCIGA. GAQCCITCAC QGAQGCTAIG AQ^GGTACr CXDGaCOOCCC 8650 
OGQGG?mX CCACAACCAG AJYTAOGACTT GGAGCITAIA ACAICPJacr 8700 
CCICCAAOGT GICAGTOGCC CACGAD3G0G CD3GAAAGAG GCJECTACTAC 8750 
CrmCCQGIG AOCCTACAAC CCGOCTGGOG AGAGOOaOGT QGGAGACAGC 8800 
AAGACACACT OCAGTCAATT CCIGQCT?^ CAACAIAATC ALLUiTiuOQC 8850 
QCACACIGIG GQQGAOGAIG ATACIGAIGA. GCCALL'i'iUi'r TAGCGTOCIC 8900 
ATAQQCAQGG ATCAGCTTGA ACAQQCICIT AACIGTGAGA. TCTACQGMC 8950 
CltXTACTOC ATAGAAOCAC TOGAIUIACC TOCAATCATr CAAAGACICC 9000 

FIG. 16E
i\[[GGCXriCAG 03CATITICA CICCACAGIT 1GAAATCAAT 9050

AQQGia3CCG CNJX30CICN2 AAAACITOQG GICGaaQOCT TQOGAGCnG 9100 

GAGACAODGG QOOOGGAGOG TCOaOQCIAG QL'i'iCIGICC AGAQGAGQCA 9150 

GQQCIQCIAT AIUiaQCAAG TAOCTCITCA. ?CJX333C?Gr AftGAMA?^ 9200 

CICAAACICA dOCAAIAQC 000000000 OOQCIOOACr TOrOOOGTIG 9250 

GTICADOQCr QOCIM»GOG QOOGACaCAr TIATCACAGC GIGICICAIG 9300 

0000000000 ciGoncroo TmoociAC lociociaQC tgcaqoooia 9350 

QGCATCTAOC 10CIOOOCAA. OOGAIGAAQG TIGOOOrAAA CACIOOOQOC 9400 

lUTEAAQOCA ' i ' i ' iujiurrr ' i'i'i'i'i'i'i'ri'i ' TmrnnT Trrncrnr 9450 

' I ' i ' i'i ' i ' i'iuiT TCCiTTCcrr cmrmoc TrirmTic orricrmA 9500 

IQGIOOCIOC ATCITAGOOC TAGICACQ3C T?^GOIGIGAA. AGGTOOOIGA. 9550 

QCOrMGAC IGCAGAGAGT GCIGAIACIG QOOICICIGC AGAICAIGT 9599 



FIG. 16F
10 20 30 40 50

1234567890 1234567890 1234567890 1234567890 1234567890

MSINEKPQRK TKF!NIM?RPQ DVKFPQQQQI VGJJYLLPPR GPEIDSVRMR 50 

KASEE^SQPRG RRQPIEE^SJRR msmPQ^ YPWEOOIEE UmOMISP 100 

PGSRPSWGPT DPRPRSENLG RVroiLTOGF PUMSHFUJ GAFLj3GAARA 150 

LAH3VRVIS) GMmin<KJP QCSFSIFIIA LiLSCLTIPAS AZEVF3SIVSGI 200 

"liKVTMXSNS SIVYEAAEVI lyiHTPGCVPCV QH3SISSRCW ALTPITAARN 250 

ASVPimPR HVDLiLVGim FCSftMYVGDL OGSIFLVSQL FIFSEKE?EiE3T 300 

VQDOSJCSIYP GHVSGHRM?iW I3X!iyMSIWSPIT ALWSQIXi?! PQSVVEM^TZ^ 350 

AHWGVLAGLA. YYSMVGNWMC VKEVMiLEAG VEGEIHTHSl VACOTISC^ 400 

SLFSSGASQK IQLVNUSIGSW HIlSKmLIOJ DSLQIGFFAA IF^ffiHKENSS 450 

QCPERMASCR PIDWEAQ3WG PITYIKHSISS DQRPYCWBKA PE^POGWPAS 500 

Q\raFVYCFr PSPWVGTID RSGVPIYSWG ENEITMyiLLN NIRPPQ3SJAIF 550 

GCTaIMNSIGF TKIOSGPPCN IQGVCa^lRrLI CPnrFRKHP EATYTKCGSG 600 

FWLTPRCLVD YPYKLMIYPC TUSIFSIFKyR MYVQGVEEiRL NAACNWIRGE 650 

RCNLEDRDRS ELSPLIiLSTr EWQUJCAFT TLPAL^IGLI HLHg^mOTQ 700 

YLYGVGSAFV SFAIKWEYIL IXPIaLLADAR VCACLMyiMLL lAQAEAALEN 750 

LWmAASVA GAHGILSFLV FFCAMmKG RI::APGAA!£AF lO^WPTJJJ.T. 800 
lALPPRAY?^ DmTAASGQG WLVGO-IALT LSPYYKRYIS VOMaHLQ™^ 850 
TE^VEAQLHVW VPPLNVRGGR DZWHIMIW HFTLVEDITK lUAIFGPm 900 
ID2ASm<VP YFVRVQSLIiR ICALARKIAG GHYVg^IK DGALIGIYVY 950 

NHLTPLEim HNGLRDLAVA VEPWFSRME TKLZEWSOT AACXSDIIN3L 1000 

PVSARRGQEI DjGPMXM/S KO^eiiAPIT AY?CT3lGLiL GCUTSLT3R 1050 

DKNQVEGEVQ T^STATQTFL ATCINGVCWr VYHSAGIRTI ASEKGP^Q^ 1100 

YTM/DQDLVG WPAPQ3SRSL TPCICGSSDL YLVIRHAEVI PVRRRGDSFG 1150 

SIIjSPRPISY LKGSSGGPIi CPAOiAVGLF RAAVCIRGVA KAVDFIPVEN 1200 

niriMRSPVF TnsrSSPPAVP QSFQl^AHIliA FIGSGE<SrKV PAAYAAQGYK 1250 

VLVLNPSVAA TDSPGAYMSK AH^VDENTRT GVRTITIGSP riYSTYC3<FL 1300 

ADQQCSQG?^ DIIICDBCHS TDATSIDSIG TVLDQAETAG ARLWIATKT 1350 

PEGSVIVSHP NIEEyALCTT GEIPFYGKAI PLEyiKQGRH LIPCHSKKKC 1400 

DELAAKLVAL GimVAYYPG m/SVIPTSG ETvAAA/STDAL MIGPiaDFDS 1450 

VIDCmiCVIQ T\7DFSLDPrF TIEITILPQD AVSRIQRRai TQOOPGIYR 1500 

FVAPGERPSG MTDSSVDCBC YDAGC?!Ja7YEL TPAETIVRLR AayiNTPCSLEV 1550 

CQIHLEFWG VFTGLTHEEA HELSQIKQSG ENFPYLVAYQ A1VCARAQAP 1600 

PPStAnDQ^MCC LIELKFTLBS PTPLLYRDGA V^^EVTLIHP ITKmECMS 1650 

ADLEWTSIW VLVQGVLAAL AAYCLSTOCV VIVGEm/LSG KPAIIPCiREV 1700 

LYQEFDEMEE CSQHLPYIEQ OyiyiLAEQFKQ KADCSLD^AS FHAEVTTPAV 1750 

QIMvQKm/F WAKHWEFIS GIQYLAGLST LPOMPAIASL MZ^FZAAVTSP 1800 

LTTCQIIiLFN IDGGWVAAQL AAPGAATAEV GAGLAGAAIG SVGDGKyLVD 1850 
ILxAGYGAGVA GALVAFKIMS GEVPSTEDLV NLLPAILSPG ALWGWCAA 1900 
10 20 30 40 50

1234567890 1234567890 1234567890 1234567890 1234567890 

ILE^RHVGPGE G?VVg^^yiISIRLI AE?\SRaSIHVS PIHYVPESDA. AARVIATTtSS 1950 

LTVigilEPL H2WISSECrT PC9GSWLRDI WDWICEVLSD EEOWLE^AKLM 2000 

PQLPGIPFVS CQPGiPGJm (SXSimnUl OGAErrOiVK HICMRIVGER 2050 

TCRNMaJSGIF PUSCOTIGPC TPLPAH^YKF ADWRVSAEEY VKEKEM3DBH 2100 

WSGEOTIMj KCPCQIPSPE EPIELDGVRL HRFAPPCKEL LEIEEVSFFWG 2150 

LHEYPVGSQL PCEEEECWAV LTSML1DPSH ITAEAAGRRL ARGSPPSM?^ 2200 

SSASQLSAPS liO^TCI^NHD SPCftELIEAN LLWRQEM3C3Sr ZEFVESHsIKV 2250 

VTLDSFDELV AEEDEE^Ev^ EAEILE^KSRR FAPALEVW?^ ECJmPELVEr 2300 

WKKEDYEPEV VH3CELPPER SPPVPPERKK RIWUIESTL SmLAELAIK 2350 

SPGSSSTSGI T ULNi'i'i SSE PAPSGCPEDS DTESYSSXIPP m3EPC3)EnL 2400 

SDGSWSIVSS GADTEHWCC SMSYSWIGAL VTPCAAEEQK LPI3SIALSNSL 2450 

LRHHNLWSr TSRSACQFQK KVTFLFljQVL DSHYQDME VKAAASKVKA 2500 

NLJjSVEEACS LTPHiSAKSK PGYGAm/RC HSJRKVVZmN SVWE^LLEDS 2550 

vrpmrrim KNEypcvQPE kqqrkparli vfpddsvevc ekmalyewvs 2600 

KLPIAVIVGSS YGPQYSPGQR VEBIjVQMaKS KKTEM3FSYD TFCFDSIVTE 2650 

SDIRIEEAIY QCCDIDPQAR VAIKSLTEFL YVQGPCINSR GEN2GYPRCR 2700 

ASGVLTTSCG NTLTCYTKAR AACRAAGLQD CM.VCX3X)L WICESAGVQ 2750 

EDAASDE^AFT EAMIRYSAPP GDPPQPEYDL ELITSCSSN\/ SVAHDGAGKR 2800 

VYYLTERDPIT PLARAAWEim PHTPVNSWLG NUMFAPTLM AEMrLMIHFF 2850 

SVLIARDQLZ QALNCEIYGZV CYSIEPLiXP PIIQRLHGIS AFSIHSYSPG 2900 

EINR^/Z^ACLR KLGVPPIi^ EEEARSVRAR LLSRQGE^AAI GGKYUTSKaIAV 2950 

RTKLiOjTPIA AAGE^LSGW FIAGYSQGDI YHSVSHARPR WFWF TT ,T ,T i T A 3000 

AGVGIYLifN R 3011 

FIG. 16H



#1a. 3' Deletion mutants of pCV-H77C 

Sequence of 3' untranslated region of pCV-H77C 
[Diagram: 5' UTR; ORF; 3' UTR, divided into the 3' variable region (43 nts), the poly U-UC region (81 nts), and the 3' conserved region]

(3' variable region; 43 nts)

(Stop codon for polyprotein)

AGGTTGGGGT AAACACTCCG GCCT CTTAAG CCATTTCCTG 

(poly U-UC region; 81 nts)

TTTTTTTTTT TTTTTTTTTT TTTTTTTTCT TTTTTTTTTT CTTTCCTTTC
CTTCTTTTTT TCCTTTCTTT TTCCCTTCTT T 

(3' conserved region; 101 nts) 

AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT 
GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG T



#1a -1. pCV-H77C(-98X) ; 3' 98 nucleotides removed from pCV-H77C 

TGAAGGTTGG GGTAAACACT CCGGCCT CTT AAG CCATTTC CTGTTTTTTT 
TTTTTTTTTT TTTTTTTTTT TCTTTTTTTT TTTCTTTCCT TTCCTTCTTT
TTTTCCTTTC TTTTTCCCTT CTTTAAT 

#1a -2. pCV-H77C(-42X) ; 3' 42 nucleotides removed from pCV-H77C 

TGAA GGTTGG GGTAAACACT CCGGCCT CTT AAG CCATTTC CTGTTTTTTT 
TTTTTTTTTT TTTTTTTTTT TCTTTTTTTT TTTCTTTCCT TTCCTTCTTT 
TTTTCCTTTC TTTTTCCCTT CTTTAATGGT GGCTCCATCT TAGCCCTAGT 
CACGGCTAGC TGTGAAAGGT CCGTGAGCCG CAT 

#1a -3. pCV-H77C(X-52) ; All of the 3' UTR sequence, except 3' 49 nucleotides, 
removed from pCV-H77C 

TGAGCCGCAT GACTGCAGAG AGTGCTGATA CTGGCCTCTC TGCAGATCAT 
GT 

FIG. 17A



#1a -4. pCV-H77C(X) ; All of the 3' UTR sequence, except 3' 101 nucleotides, 
removed from pCV-H77C 

TGAAATGGTG GCTCCATCTT AGCCCTAGTC ACGGCTAGCT GTGAAAGGTC 
CGTGAGCCGC ATGACTGCAG AGAGTGCTGA TACTGGCCTC TCTGCAGATC 
ATGT 

#1a -5. pCV-H77C(+49X) ; The proximal 49 nucleotides of the 3' conserved 
region ( 98 nucleotides ; AAT not included) removed from pCV-H77C 

TGAA GGTTGG GGTAAACACT CCGGCCT CTT AAG CCATTTC CTGTTTTTTT
TTTTTTTTTT TTTTTTTTTT TCTTTTTTTT TTTCTTTCCT TTCCTTCTTT
TTTTCCTTTC TTTTTCCCTT CTTTAATGCC GCATGACTGC AGAGAGTGCT 
GATACTGGCC TCTCTGCAGA TCATGT 

#1a -6. pCV-H77C(VR-24) ; First 24 nucleotides of the 3' variable region 
removed from pCV-H77C 

TGACTTAAGC CATTTCCTGT TTTTTTTTTT TTTTTTTTTT TTTTTTTCTT
TTTTTTTTTC TTTCCTTTCC TTCTTTTTTT CCTTTCTTTT TCCCTTCTTT 
AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT 
GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG 
T 

#1a -7. pCV-H77C(-U/UC) ; Poly U-UC region removed from pCV-H77C 

TGAAGGTTGG GGTAAACACT CCGGCCT CTT AAG CCATTTC CTGAATGGTG 
GCTCCATCTT AGCCCTAGTC ACGGCTAGCT GTGAAAGGTC CGTGAGCCGC
ATGACTGCAG AGAGTGCTGA TACTGGCCTC TCTGCAGATC ATGT 

FIG. 17B
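The relationship between the three 3' UTR regions and the seven deletion mutants listed above can be sketched in a few lines of Python. This is only an illustration and not part of the filing. The region strings are transcribed from the listing, with characters that appear to be scanning noise read as T. The final T of the conserved region is an assumption made so that the region has the stated 101 nts and agrees with the mutant listings that end in ATGT. The names STOP, VARIABLE, POLY_U_UC, CONSERVED and build_mutants are hypothetical.

```python
# Illustrative sketch only; sequences transcribed from the listing above.
STOP = "TGA"  # stop codon for the polyprotein

# 3' variable region without the stop codon (40 nts; 43 nts with STOP).
VARIABLE = "".join([
    "AGGTTGGGGT", "AAACACTCCG", "GCCTCTTAAG", "CCATTTCCTG",
])

# Poly U-UC region (81 nts).
POLY_U_UC = "".join([
    "TTTTTTTTTT", "TTTTTTTTTT", "TTTTTTTTCT", "TTTTTTTTTT", "CTTTCCTTTC",
    "CTTCTTTTTT", "TCCTTTCTTT", "TTCCCTTCTT", "T",
])

# 3' conserved region (101 nts; the trailing T is an assumption, see lead-in).
CONSERVED = "".join([
    "AATGGTGGCT", "CCATCTTAGC", "CCTAGTCACG", "GCTAGCTGTG", "AAAGGTCCGT",
    "GAGCCGCATG", "ACTGCAGAGA", "GTGCTGATAC", "TGGCCTCTCT", "GCAGATCATG", "T",
])


def build_mutants():
    """Return the 3' terminal sequence (from the stop codon) of each mutant."""
    head = STOP + VARIABLE + POLY_U_UC
    return {
        # 3' 98 nts removed: only AAT of the conserved region remains.
        "-98X": head + CONSERVED[:3],
        # 3' 42 nts removed.
        "-42X": head + CONSERVED[:-42],
        # Everything after the stop codon removed except the 3' 49 nts.
        "X-52": STOP + CONSERVED[-49:],
        # Everything after the stop codon removed except the conserved region.
        "X": STOP + CONSERVED,
        # Proximal 49 nts of the conserved region (after AAT) removed.
        "+49X": head + CONSERVED[:3] + CONSERVED[52:],
        # First 24 nts of the variable region removed.
        "VR-24": STOP + VARIABLE[24:] + POLY_U_UC + CONSERVED,
        # Poly U-UC region removed.
        "-U/UC": STOP + VARIABLE + CONSERVED,
    }


if __name__ == "__main__":
    for name, seq in build_mutants().items():
        print(f"pCV-H77C({name}): {len(seq)} nts")
```

Comparing each generated string with the corresponding listing is a quick way to spot residual transcription errors in either.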



#1b. Strategy of 3' Deletion mutants 

#1b-1. pCV-H77C(-98X)



[Diagram: 3' variable region; poly U-UC region; AAT; 3' conserved region (98 nts); Afl II (9403); PCR; Xba I]



1. PCR Amplification 

2. Purification of PCR products 

3. Digestion with Afl II and Xba I

4. Cloning of Afl II/Xba I fragment into pCV-H77C

5. Complete sequence analysis 

6. in vitro transcription (within 24 hours of inoculation) 

7. Percutaneous intrahepatic transfection into chimpanzee ; 11/26/97 and 12/17/97 

8. Result : Negative ( No replication) 



#1b-2. pCV-H77C(-42X) 

[Diagram: 3' variable region; poly U-UC region; 3' conserved region (42 nts); Afl II (9403); synthesized oligonucleotides; Nhe I (9530); Xba I*]



1. Synthesis of oligonucleotides ( sense and anti-sense ) 

2. Hybridization of oligonucleotides 

3. Digestion with Nhe I and Xba I

4. Cloning of Nhe I/Xba I fragment into pG9-KL26 (3' UTR of H77C)

5. Sequence analysis

6. Cloning of 3' UTR (-42X) [Afl II/Xba I fragment] into pCV-H77C

7. Complete sequence analysis 

8. in vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee (Schedule; 1/22/98, 2/5/98 ) 



FIG. 17C



#1b -3. pCV-H77C(X-52) 



[Diagram: NS5B; Nde I (9160); 3' variable region; TGA; poly U-UC region; 3' conserved region (52 nts | 49 nts); Pfu PCR; synthesized oligonucleotides; Xba I; fusion and extension]



1. Fragment a ; Pfu PCR amplification and purification 

2. Fragment b ; Synthesized oligonucleotides (anti-sense) 

3. Fusion and extension 

4. TA cloning 

5. Sequence analysis 

6. Cloning Nde I-Xba I fragment with correct sequence into pCV-H77C

7. Complete sequence analysis

8. In vitro transcription (within 24 hours of inoculation)

9. Percutaneous intrahepatic transfection into chimpanzee 



FIG. 17D



#1b-4. pCV-H77C(X) 



[Diagram: 3' variable region; poly U-UC region; 3' conserved region (101 nts); Nde I (9160); NS5B; synthesized oligonucleotides; Pfu PCR; Xba I*; fusion and extension; fragments a and c]



1. Fragment a ; Pfu PCR amplification and purification 

2. Fragment c ; Synthesized oligonucleotides (anti-sense) 

3. Fusion and extension 

4. TA cloning 

5. Sequence analysis 

6. Cloning Nde I-Xba I fragment with correct sequence into pCV-H77C

7. Complete sequence analysis 

8. In vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee 



FIG. 17E



#1b-5. pCV-H77C(+49X) 



[Diagram: 3' variable region; poly U-UC region; 3' conserved region (49 nts); synthesized oligonucleotides; Pfu PCR; Xba I*; fusion and extension; fragments d and e]



1. Fragment d ; Pfu PCR amplification and purification 

2. Fragment e ; Synthesized oligonucleotides (anti-sense) 

3. Fusion and extension 

4. TA cloning 

5. Sequence analysis 

6. Cloning Afl II-Xba I fragment with correct sequence into pCV-H77C

7. Complete sequence analysis

8. In vitro transcription (within 24 hours of inoculation)

9. Percutaneous intrahepatic transfection into chimpanzee 



FIG. 17F



#1b -6. pCV-H77C(VR-24) 





[Diagram: NS5B; TGA; 3' variable region (first 24 nts marked); poly U-UC region; 3' conserved region; Nde I (9160); Afl II (9403); Afl II*]

1. PCR amplification

2. Purification of PCR products

3. Digestion with Nde I and Afl II

4. Cloning of Nde I/Afl II fragment into pCV-H77C

5. Complete sequence analysis 

6. in vitro transcription (within 24 hours of inoculation) 

7. Percutaneous intrahepatic transfection into chimpanzee 



#1b -7. pCV-H77C(-U/UC) 

[Diagram: 3' variable region; poly U-UC region; 3' conserved region; Afl II (9403); Nhe I (9530); synthesized oligonucleotides; Afl II]

1. Synthesis of oligonucleotides ( sense and anti-sense ) 

2. Hybridization of oligonucleotides 

3. Digestion with Afl II and Nhe I

4. Cloning of Afl II and Nhe I fragment into pG9-KL26

5. Sequence analysis

6. Cloning of 3' UTR (-poly U-UC) [Afl II/Xba I fragment] into pCV-H77C

7. Complete sequence analysis 

8. in vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee 

FIG. 17G

SEQUENCE LISTING



<110> Yanagi, Masayuki 

Emerson, Susanne U. 
Purcell, Robert H. 
Bukh, Jens

<120> CLONED GENOMES OF INFECTIOUS HEPATITIS C VIRUSES AND
USES THEREOF 

<130> 20264276US1 

<140> 
<141> 



<150> US 60/053,062
<151> 1997-07-18 



<150> US 09/014,416 
<151> 1998-01-27 



<170> PatentIn Ver. 2.1

<210> 1 
<211> 3011 
<212> PRT 

<213> Hepatitis C virus 



<400> 1 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly



Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 






Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro 
100 105 110 



Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140 

Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp 
145 150 155 160 

Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175 

Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr 
180 185 190 

Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp Cys Pro
195 200 205

Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu His Thr Pro
210 215 220 

Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser Arg Cys Trp Val 
225 230 235 240 

Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly Lys Leu Pro Thr Thr 
245 250 255 

Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly Ser Ala Thr Leu Cys
260 265 270 

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Gly 
275 280 285 

Gln Leu Phe Thr Phe Ser Pro Arg Arg His Trp Thr Thr Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Ala Ala Leu Val Val Ala Gln
325 330 335

Leu Leu Arg Ile Pro Gln Ala Ile Met Asp Met Ile Ala Gly Ala His
340 345 350 




Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp
355 360 365 



Ala Lys Val Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu 
370 375 380 

Thr His Val Thr Gly Gly Asn Ala Gly Arg Thr Thr Ala Gly Leu Val 
385 390 395 400 

Gly Leu Leu Thr Pro Gly Ala Lys Gln Asn Ile Gln Leu Ile Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Ser Thr Ala Leu Asn Cys Asn Glu Ser
420 425 430

Leu Asn Thr Gly Trp Leu Ala Gly Leu Phe Tyr Gln His Lys Phe Asn
435 440 445 

Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Arg Leu Thr Asp 
450 455 460 

Phe Ala Gln Gly Trp Gly Pro Ile Ser Tyr Ala Asn Gly Ser Gly Leu
465 470 475 480

Asp Glu Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro Cys Gly Ile
485 490 495 

Val Pro Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser 
500 505 510 

Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser 
515 520 525 

Trp Gly Ala Asn Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg Pro 
530 535 540 

Pro Leu Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe 
545 550 555 560 

Thr Lys Val Cys Gly Ala Pro Pro Cys Val Ile Gly Gly Val Gly Asn
565 570 575 

Asn Thr Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala 
580 585 590 

Thr Tyr Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg Cys Met
595 600 605






Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile Asn Tyr
610 615 620

Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Leu
625 630 635 640 

Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp 
645 650 655 

Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gln Trp
660 665 670

Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly
675 680 685

Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly
690 695 700

Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys Trp Glu Tyr Val Val
705 710 715 720 

Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys Ser Cys Leu Trp 
725 730 735 

Met Met Leu Leu Ile Ser Gln Ala Glu Ala Ala Leu Glu Asn Leu Val
740 745 750

Ile Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly Leu Val Ser Phe
755 760 765 

Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly Arg Trp Val Pro 
770 775 780 

Gly Ala Val Tyr Ala Leu Tyr Gly Met Trp Pro Leu Leu Leu Leu Leu 
785 790 795 800 

Leu Ala Leu Pro Gln Arg Ala Tyr Ala Leu Asp Thr Glu Val Ala Ala
805 810 815 

Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala Leu Thr Leu Ser 
820 825 830 

Pro Tyr Tyr Lys Arg Tyr Ile Ser Trp Cys Met Trp Trp Leu Gln Tyr
835 840 845

Phe Leu Thr Arg Val Glu Ala Gln Leu His Val Trp Val Pro Pro Leu
850 855 860 
Asn Val Arg Gly Gly Arg Asp Ala Val Ile Leu Leu Met Cys Val Val
865 870 875 880

His Pro Thr Leu Val Phe Asp Ile Thr Lys Leu Leu Leu Ala Ile Phe
885 890 895

Gly Pro Leu Trp Ile Leu Gln Ala Ser Leu Leu Lys Val Pro Tyr Phe
900 905 910

Val Arg Val Gln Gly Leu Leu Arg Ile Cys Ala Leu Ala Arg Lys Ile
915 920 925

Ala Gly Gly His Tyr Val Gln Met Ala Ile Ile Lys Leu Gly Ala Leu
930 935 940 

Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala 
945 950 955 960 

His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe 
965 970 975 

Ser Arg Met Glu Thr Lys Leu Ile Thr Trp Gly Ala Asp Thr Ala Ala
980 985 990

Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala Arg Arg Gly Gln
995 1000 1005

Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser Lys Gly Trp Arg
1010 1015 1020

Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu
1025 1030 1035 1040

Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
1045 1050 1055

Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr
1060 1065 1070

Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg
1075 1080 1085

Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val
1090 1095 1100

Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg Ser Leu
1105 1110 1115 1120 
Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His
1125 1130 1135

Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
1140 1145 1150

Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro
1155 1160 1165 

Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe Arg Ala Ala Val 
1170 1175 1180 

Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Asn
1185 1190 1195 1200 

Leu Gly Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro 
1205 1210 1215 

Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr
1220 1225 1230

Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly
1235 1240 1245 

Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe 
1250 1255 1260 

Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg Thr
1265 1270 1275 1280

Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr
1285 1290 1295

Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile
1300 1305 1310

Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly
1315 1320 1325

Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val
1330 1335 1340 

Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Ser His Pro 
1345 1350 1355 1360 

Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr
1365 1370 1375

Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile
1380 1385 1390 



Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val 
1395 1400 1405 

Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser
1410 1415 1420

Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ser Thr Asp Ala Leu
1425 1430 1435 1440

Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr
1445 1450 1455

Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile
1460 1465 1470

Glu Thr Thr Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg
1475 1480 1485

Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro
1490 1495 1500 

Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys 
1505 1510 1515 1520 

Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr 
1525 1530 1535 

Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln
1540 1545 1550

Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile
1555 1560 1565

Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Phe Pro
1570 1575 1580

Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro
1585 1590 1595 1600

Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro
1605 1610 1615

Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln
1620 1625 1630

Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Thr Cys
1635 1640 1645 



Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly 
1650 1655 1660 

Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val 
1665 1670 1675 1680 

Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro Ala Ile Ile Pro
1685 1690 1695

Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ser
1700 1705 1710

Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe
1715 1720 1725

Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg His Ala Glu
1730 1735 1740

Val Ile Thr Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Val Phe
1745 1750 1755 1760

Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala
1765 1770 1775

Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala
1780 1785 1790

Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly Gln Thr Leu Leu
1795 1800 1805

Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly
1810 1815 1820

Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly
1825 1830 1835 1840

Ser Val Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly
1845 1850 1855

Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu
1860 1865 1870 



Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser
1875 1880 1885

Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg
1890 1895 1900

His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile
1905 1910 1915 1920 

Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro 
1925 1930 1935 

Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr
1940 1945 1950

Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys
1955 1960 1965

Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile
1970 1975 1980 

Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met 
1985 1990 1995 2000 

Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Arg
2005 2010 2015

Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly
2020 2025 2030

Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly
2035 2040 2045

Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala
2050 2055 2060 

Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Lys Phe 
2065 2070 2075 2080 

Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Arg Val
2085 2090 2095 

Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp Asn Leu Lys Cys 
2100 2105 2110 

Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val
2115 2120 2125 

Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu 
2130 2135 2140 
Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu
2145 2150 2155 2160 



Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr 
2165 2170 2175 

Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg
2180 2185 2190

Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala
2195 2200 2205 

Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala 
2210 2215 2220 

Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn
2225 2230 2235 2240

Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe
2245 2250 2255 

Asp Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala 
2260 2265 2270 

Glu Ile Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala Leu Pro Val Trp
2275 2280 2285 

Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro 
2290 2295 2300 

Asp Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Arg 
2305 2310 2315 2320 

Ser Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr 
2325 2330 2335 

Glu Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Lys Ser Phe 
2340 2345 2350 

Gly Ser Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser
2355 2360 2365 

Ser Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Val Glu Ser 
2370 2375 2380 

Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu 
2385 2390 2395 2400 
Ser Asp Gly Ser Trp Ser Thr Val Ser Ser Gly Ala Asp Thr Glu Asp
2405 2410 2415 



Val Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr 
2420 2425 2430 

Pro Cys Ala Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn
2435 2440 2445 

Ser Leu Leu Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser 
2450 2455 2460 

Ala Cys Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu
2465 2470 2475 2480

Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser
2485 2490 2495 

Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr 
2500 2505 2510 

Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val 
2515 2520 2525 

Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn Ser Val Trp Lys
2530 2535 2540

Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr Thr Ile Met Ala
2545 2550 2555 2560

Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro
2565 2570 2575

Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys
2580 2585 2590 

Met Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu Ala Val Met Gly 
2595 2600 2605 

Ser Ser Tyr Gly Phe Gin Tyr Ser Pro Gly Gin Arg Val Glu Phe Leu 
2610 2615 2620 

Val Gin Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp 
2625 2630 2635 2640 

Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp He Arg Thr Glu 
2645 2650 2655

Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala
2660 2665 2670

Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn
2675 2680 2685

Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val
2690 2695 2700

Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg
2705 2710 2715 2720

Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys
2725 2730 2735

Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp
2740 2745 2750 

Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala 
2755 2760 2765 

Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr
2770 2775 2780 

Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg 
2785 2790 2795 2800 

Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala 
2805 2810 2815 

Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile
2820 2825 2830

Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His
2835 2840 2845

Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asn
2850 2855 2860

Cys Glu Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro
2865 2870 2875 2880

Pro Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser
2885 2890 2895

Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu
2900 2905 2910

Gly Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg
2915 2920 2925

Ala Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr
2930 2935 2940

Leu Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala
2945 2950 2955 2960

Ala Ala Gly Arg Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser
2965 2970 2975

Gly Gly Asp Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Phe
2980 2985 2990

Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu
2995 3000 3005 

Pro Asn Arg 
3010 



<210> 2 
<211> 9599 
<212> DNA 

<213> Hepatitis C virus 
<400> 2 

gccagccccc tgatgggggc gacactccac catgaatcac tcccctgtga ggaactactg 60 
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 
gacgaccggg tcctttcttg gataaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 
gcaagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360
ctcaaagaaa aaccaaacgt aacaccaacc gtcgcccaca ggacgtcaag ttcccgggtg 420
gcggtcagat cgttggtgga gtttacttgt tgccgcgcag gggccctaga ttgggtgtgc 480
gcgcgacgag gaagacttcc gagcggtcgc aacctcgagg tagacgtcag cctatcccca 540
aggcacgtcg gcccgagggc aggacctggg ctcagcccgg gtacccttgg cccctctatg 600 
gcaatgaggg ttgcgggtgg gcgggatggc tcctgtctcc ccgtggctct cggcctagct 660
ggggccccac agacccccgg cgtaggtcgc gcaatttggg taaggtcatc gataccctta 720
cgtgcggctt cgccgacctc atggggtaca taccgctcgt cggcgcccct cttggaggcg 780 
ctgccagggc cctggcgcat ggcgtccggg ttctggaaga cggcgtgaac tatgcaacag 840
ggaaccttcc tggttgctct ttctctatct tccttctggc cctgctctct tgcctgactg 900 
tgcccgcttc agcctaccaa gtgcgcaatt cctcggggct ttaccatgtc accaatgatt 960 
gccctaactc gagtattgtg tacgaggcgg ccgatgccat cctgcacact ccggggtgtg 1020
tcccttgcgt tcgcgagggt aacgcctcga ggtgttgggt ggcggtgacc cccacggtgg 1080 
ccaccaggga cggcaaactc cccacaacgc agcttcgacg tcatatcgat ctgcttgtcg 1140
ggagcgccac cctctgctcg gccctctacg tgggggacct gtgcgggtct gtctttcttg 1200
ttggtcaact gtttaccttc tctcccaggc gccactggac gacgcaagac tgcaattgtt 1260
ctatctatcc cggccatata acgggtcatc gcatggcatg ggatatgatg atgaactggt 1320
cccctacggc agcgttggtg gtagctcagc tgctccggat cccacaagcc atcatggaca 1380
tgatcgctgg tgctcactgg ggagtcctgg cgggcatagc gtatttctcc atggtgggga 1440
actgggcgaa ggtcctggta gtgctgctgc tatttgccgg cgtcgacgcg gaaacccacg 1500
tcaccggggg aaatgccggc cgcaccacgg ctgggcttgt tggtctcctt acaccaggcg 1560
ccaagcagaa catccaactg atcaacacca acggcagttg gcacatcaat agcacggcct 1620
tgaattgcaa tgaaagcctt aacaccggct ggttagcagg gctcttctat caacacaaat 1680
tcaactcttc aggctgtcct gagaggttgg ccagctgccg acgccttacc gattttgccc 1740
agggctgggg tcctatcagt tatgccaacg gaagcggcct cgacgaacgc ccctactgct 1800
ggcactaccc tccaagacct tgtggcattg tgcccgcaaa gagcgtgtgt ggcccggtat 1860
attgcttcac tcccagcccc gtggtggtgg gaacgaccga caggtcgggc gcgcctacct 1920
acagctgggg tgcaaatgat acggatgtct tcgtccttaa caacaccagg ccaccgctgg 1980
gcaattggtt cggttgtacc tggatgaact caactggatt caccaaagtg tgcggagcgc 2040
ccccttgtgt catcggaggg gtgggcaaca acaccttgct ctgccccact gattgcttcc 2100
gcaaacatcc ggaagccaca tactctcggt gcggctccgg tccctggatt acacccaggt 2160
gcatggtcga ctacccgtat aggctttggc actatccttg taccatcaat tacaccatat 2220
tcaaagtcag gatgtacgtg ggaggggtcg agcacaggct ggaagcggcc tgcaactgga 2280
cgcggggcga acgctgtgat ctggaagaca gggacaggtc cgagctcagc ccgttgctgc 2340
tgtccaccac acagtggcag gtccttccgt gttctttcac gaccctgcca gccttgtcca 2400
ccggcctcat ccacctccac cagaacattg tggacgtgca gtacttgtac ggggtagggt 2460
caagcatcgc gtcctgggcc attaagtggg agtacgtcgt tctcctgttc cttctgcttg 2520
cagacgcgcg cgtctgctcc tgcttgtgga tgatgttact catatcccaa gcggaggcgg 2580
ctttggagaa cctcgtaata ctcaatgcag catccctggc cgggacgcac ggtcttgtgt 2640
ccttcctcgt gttcttctgc tttgcgtggt atctgaaggg taggtgggtg cccggagcgg 2700
tctacgccct ctacgggatg tggcctctcc tcctgctcct gctggcgttg cctcagcggg 2760
catacgcact ggacacggag gtggccgcgt cgtgtggcgg cgttgttctt gtcgggttaa 2820
tggcgctgac tctgtcgcca tattacaagc gctatatcag ctggtgcatg tggtggcttc 2880
agtattttct gaccagagta gaagcgcaac tgcacgtgtg ggttcccccc ctcaacgtcc 2940
ggggggggcg cgatgccgtc atcttactca tgtgtgtagt acacccgacc ctggtatttg 3000
acatcaccaa actactcctg gccatcttcg gacccctttg gattcttcaa gccagtttgc 3060
ttaaagtccc ctacttcgtg cgcgttcaag gccttctccg gatctgcgcg ctagcgcgga 3120
agatagccgg aggtcattac gtgcaaatgg ccatcatcaa gttaggggcg cttactggca 3180
cctatgtgta taaccatctc acccctcttc gagactgggc gcacaacggc ctgcgagatc 3240
tggccgtggc tgtggaacca gtcgtcttct cccgaatgga gaccaagctc atcacgtggg 3300
gggcagatac cgccgcgtgc ggtgacatca tcaacggctt gcccgtctct gcccgtaggg 3360
gccaggagat actgcttggg ccagccgacg gaatggtctc caaggggtgg aggttgctgg 3420
cgcccatcac ggcgtacgcc cagcagacga gaggcctcct agggtgtata atcaccagcc 3480
tgactggccg ggacaaaaac caagtggagg gtgaggtcca gatcgtgtca actgctaccc 3540
aaaccttcct ggcaacgtgc atcaatgggg tatgctggac tgtctaccac ggggccggaa 3600
cgaggaccat cgcatcaccc aagggtcctg tcatccagat gtataccaat gtggaccaag 3660
accttgtggg ctggcccgct cctcaaggtt cccgctcatt gacaccctgt acctgcggct 3720
cctcggacct ttacctggtc acgaggcacg ccgatgtcat tcccgtgcgc cggcgaggtg 3780
atagcagggg tagcctgctt tcgccccggc ccatttccta cttgaaaggc tcctcggggg 3840
gtccgctgtt gtgccccgcg ggacacgccg tgggcctatt cagggccgcg gtgtgcaccc 3900
gtggagtggc taaagcggtg gactttatcc ctgtggagaa cctagggaca accatgagat 3960
ccccggtgtt cacggacaac tcctctccac cagcagtgcc ccagagcttc caggtggccc 4020
acctgcatgc tcccaccggc agcggtaaga gcaccaaggt cccggctgcg tacgcagccc 4080
agggctacaa ggtgttggtg ctcaacccct ctgttgctgc aacgctgggc tttggtgctt 4140 
acatgtccaa ggcccatggg gttgatccta atatcaggac cggggtgaga acaattacca 4200
ctggcagccc catcacgtac tccacctacg gcaagttcct tgccgacggc gggtgctcag 4260 
gaggtgctta tgacataata atttgtgacg agtgccactc cacggatgcc acatccatct 4320 
tgggcatcgg cactgtcctt gaccaagcag agactgcggg ggcgagactg gttgtgctcg 4380 
ccactgctac ccctccgggc tccgtcactg tgtcccatcc taacatcgag gaggttgctc 4440 
tgtccaccac cggagagatc cccttttacg gcaaggctat ccccctcgag gtgatcaagg 4500 
ggggaagaca tctcatcttc tgccactcaa agaagaagtg cgacgagctc gccgcgaagc 4560 
tggtcgcatt gggcatcaat gccgtggcct actaccgcgg tcttgacgtg tctgtcatcc 4620
cgaccagcgg cgatgttgtc gtcgtgtcga ccgatgctct catgactggc tttaccggcg 4680 
acttcgactc tgtgatagac tgcaacacgt gtgtcactca gacagtcgat ttcagccttg 4740 
accctacctt taccattgag acaaccacgc tcccccagga tgctgtctcc aggactcaac 4800 
gccggggcag gactggcagg gggaagccag gcatctatag atttgtggca ccgggggagc 4860 
gcccctccgg catgttcgac tcgtccgtcc tctgtgagtg ctatgacgcg ggctgtgctt 4920 
ggtatgagct cacgcccgcc gagactacag ttaggctacg agcgtacatg aacaccccgg 4980
ggcttcccgt gtgccaggac catcttgaat tttgggaggg cgtctttacg ggcctcactc 5040
atatagatgc ccacttttta tcccagacaa agcagagtgg ggagaacttt ccttacctgg 5100 
tagcgtacca agccaccgtg tgcgctaggg ctcaagcccc tcccccatcg tgggaccaga 5160 
tgtggaagtg tttgatccgc cttaaaccca ccctccatgg gccaacaccc ctgctataca 5220 
gactgggcgc tgttcagaat gaagtcaccc tgacgcaccc aatcaccaaa tacatcatga 5280 
catgcatgtc ggccgacctg gaggtcgtca cgagcacctg ggtgctcgtt ggcggcgtcc 5340 
tggctgctct ggccgcgtat tgcctgtcaa caggctgcgt ggtcatagtg ggcaggatcg 5400
tcttgtccgg gaagccggca attatacctg acagggaggt tctctaccag gagttcgatg 5460 
agatggaaga gtgctctcag cacttaccgt acatcgagca agggatgatg ctcgctgagc 5520
agttcaagca gaaggccctc ggcctcctgc agaccgcgtc ccgccatgca gaggttatca 5580 
cccctgctgt ccagaccaac tggcagaaac tcgaggtctt ttgggcgaag cacatgtgga 5640
atttcatcag tgggatacaa tacttggcgg gcctgtcaac gctgcctggt aaccccgcca 5700 
ttgcttcatt gatggctttt acagctgccg tcaccagccc actaaccact ggccaaaccc 5760 
tcctcttcaa catattgggg gggtgggtgg ctgcccagct cgccgccccc ggtgccgcta 5820
ctgcctttgt gggtgctggc ctagctggcg ccgccatcgg cagcgttgga ctggggaagg 5880 
tcctcgtgga cattcttgca gggtatggcg cgggcgtggc gggagctctt gtagcattca 5940
agatcatgag cggtgaggtc ccctccacgg aggacctggt caatctgctg cccgccatcc 6000 
tctcgcctgg agcccttgta gtcggtgtgg tctgcgcagc aatactgcgc cggcacgttg 6060 
gcccgggcga gggggcagtg caatggatga accggctaat agccttcgcc tcccggggga 6120 
accatgtttc ccccacgcac tacgtgccgg agagcgatgc agccgcccgc gtcactgcca 6180 
tactcagcag cctcactgta acccagctcc tgaggcgact gcatcagtgg ataagctcgg 6240 
agtgtaccac tccatgctcc ggttcctggc taagggacat ctgggactgg atatgcgagg 6300
tgctgagcga ctttaagacc tggctgaaag ccaagctcat gccacaactg cctgggattc 6360
cctttgtgtc ctgccagcgc gggtataggg gggtctggcg aggagacggc attatgcaca 6420 
ctcgctgcca ctgtggagct gagatcactg gacatgtcaa aaacgggacg atgaggatcg 6480
tcggtcctag gacctgcagg aacatgtgga gtgggacgtt ccccattaac gcctacacca 6540 
cgggcccctg tactcccctt cctgcgccga actataagtt cgcgctgtgg agggtgtctg 6600 
cagaggaata cgtggagata aggcgggtgg gggacttcca ctacgtatcg ggtatgacta 6660 
ctgacaatct taaatgcccg tgccagatcc catcgcccga atttttcaca gaattggacg 6720
gggtgcgcct acacaggttt gcgccccctt gcaagccctt gctgcgggag gaggtatcat 6780
tcagagtagg actccacgag tacccggtgg ggtcgcaatt accttgcgag cccgaaccgg 6840 
acgtagccgt gttgacgtcc atgctcactg atccctccca tataacagca gaggcggccg 6900
ggagaaggtt ggcgagaggg tcaccccctt ctatggccag ctcctcggct agccagctgt 6960
ccgctccatc tctcaaggca acttgcaccg ccaaccatga ctcccctgac gccgagctca 7020
tagaggctaa cctcctgtgg aggcaggaga tgggcggcaa catcaccagg gttgagtcag 7080
agaacaaagt ggtgattctg gactccttcg atccgcttgt ggcagaggag gatgagcggg 7140
aggtctccgt acctgcagaa attctgcgga agtctcggag attcgcccgg gccctgcccg 7200
tctgggcgcg gccggactac aaccccccgc tagtagagac gtggaaaaag cctgactacg 7260
aaccacctgt ggtccatggc tgcccgctac cacctccacg gtcccctcct gtgcctccgc 7320
ctcggaaaaa gcgtacggtg gtcctcaccg aatcaaccct atctactgcc ttggccgagc 7380
ttgccaccaa aagttttggc agctcctcaa cttccggcat tacgggcgac aatacgacaa 7440
catcctctga gcccgcccct tctggctgcc cccccgactc cgacgttgag tcctattctt 7500
ccatgccccc cctggagggg gagcctgggg atccggatct cagcgacggg tcatggtcga 7560
cggtcagtag tggggccgac acggaagatg tcgtgtgctg ctcaatgtct tattcctgga 7620
caggcgcact cgtcaccccg tgcgctgcgg aagaacaaaa actgcccatc aacgcactga 7680
gcaactcgtt gctacgccat cacaatctgg tgtattccac cacttcacgc agtgcttgcc 7740
aaaggcagaa gaaagtcaca tttgacagac tgcaagttct ggacagccat taccaggacg 7800
tgctcaagga ggtcaaagca gcggcgtcaa aagtgaaggc taacttgcta tccgtagagg 7860
aagcttgcag cctgacgccc ccacattcag ccaaatccaa gtttggctat ggggcaaaag 7920
acgtccgttg ccatgccaga aaggccgtag cccacatcaa ctccgtgtgg aaagaccttc 7980
tggaagacag tgtaacacca atagacacta ccatcatggc caagaacgag gttttctgcg 8040
ttcagcctga gaaggggggt cgtaagccag ctcgtctcat cgtgttcccc gacctgggcg 8100
tgcgcgtgtg cgagaagatg gccctgtacg acgtggttag caagctcccc ctggccgtga 8160
tgggaagctc ctacggattc caatactcac caggacagcg ggttgaattc ctcgtgcaag 8220
cgtggaagtc caagaagacc ccgatggggt tctcgtatga tacccgctgt tttgactcca 8280
cagtcactga gagcgacatc cgtacggagg aggcaattta ccaatgttgt gacctggacc 8340
cccaagcccg cgtggccatc aagtccctca ctgagaggct ttatgttggg ggccctctta 8400
ccaattcaag gggggaaaac tgcggctacc gcaggtgccg cgcgagcggc gtactgacaa 8460
ctagctgtgg taacaccctc acttgctaca tcaaggcccg ggcagcctgt cgagccgcag 8520
ggctccagga ctgcaccatg ctcgtgtgtg gcgacgactt agtcgttatc tgtgaaagtg 8580
cgggggtcca ggaggacgcg gcgagcctga gagccttcac ggaggctatg accaggtact 8640
ccgccccccc cggggacccc ccacaaccag aatacgactt ggagcttata acatcatgct 8700
cctccaacgt gtcagtcgcc cacgacggcg ctggaaagag ggtctactac cttacccgtg 8760
accctacaac ccccctcgcg agagccgcgt gggagacagc aagacacact ccagtcaatt 8820
cctggctagg caacataatc atgtttgccc ccacactgtg ggcgaggatg atactgatga 8880
cccatttctt tagcgtcctc atagccaggg atcagcttga acaggctctt aactgtgaga 8940
tctacggagc ctgctactcc atagaaccac tggatctacc tccaatcatt caaagactcc 9000
atggcctcag cgcattttca ctccacagtt actctccagg tgaaatcaat agggtggccg 9060
catgcctcag aaaacttggg gtcccgccct tgcgagcttg gagacaccgg gcccggagcg 9120
tccgcgctag gcttctgtcc agaggaggca gggctgccat atgtggcaag tacctcttca 9180
actgggcagt aagaacaaag ctcaaactca ctccaatagc ggccgctggc cggctggact 9240
tgtccggttg gttcacggct ggctacagcg ggggagacat ttatcacagc gtgtctcatg 9300
cccggccccg ctggttctgg ttttgcctac tcctgctcgc tgcaggggta ggcatctacc 9360
tcctccccaa ccgatgaagg ttggggtaaa cactccggcc tcttaagcca tttcctgttt 9420
tttttttttt tttttttttt tttttctttt tttttttctt tcctttcctt ctttttttcc 9480
tttctttttc ccttctttaa tggtggctcc atcttagccc tagtcacggc tagctgtgaa 9540
aggtccgtga gccgcatgac tgcagagagt gctgatactg gcctctctgc agatcatgt 9599

<210> 3
<211> 3010
<212> PRT 

<213> Hepatitis C virus 



<400> 3 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Ala Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95



Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140

Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Ile Pro Ala Ser Ala Tyr
180 185 190

Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp Cys Ser
195 200 205

Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Val Ile Met His Thr Pro
210 215 220

Gly Cys Val Pro Cys Val Gln Glu Gly Asn Ser Ser Arg Cys Trp Val
225 230 235 240



Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr
245 250 255

Thr Ile Arg Arg His Val Asp Leu Leu Val Gly Thr Ala Ala Phe Cys
260 265 270

Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Ser Ile Phe Leu Val Ser
275 280 285

Gln Leu Phe Thr Phe Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly His Val Ser Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gln
325 330 335

Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His
340 345 350

Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp
355 360 365

Ala Lys Val Leu Ile Val Ala Leu Leu Phe Ala Gly Val Asp Gly Glu
370 375 380

Thr His Thr Thr Gly Arg Val Ala Gly His Thr Thr Ser Gly Phe Thr
385 390 395 400

Ser Leu Phe Ser Ser Gly Ala Ser Gln Lys Ile Gln Leu Val Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser
420 425 430

Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr Ala His Lys Phe Asn
435 440 445

Ser Ser Gly Cys Pro Glu Arg Met Ala Ser Cys Arg Pro Ile Asp Trp
450 455 460

Phe Ala Gln Gly Trp Gly Pro Ile Thr Tyr Thr Lys Pro Asn Ser Ser
465 470 475 480

Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Val
485 490 495



Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser
500 505 510

Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Val Pro Thr Tyr Ser
515 520 525

Trp Gly Glu Asn Glu Thr Asp Val Met Leu Leu Asn Asn Thr Arg Pro
530 535 540

Pro Gln Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe
545 550 555 560

Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Val Gly Asn
565 570 575

Arg Thr Leu Ile Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala
580 585 590

Thr Tyr Thr Lys Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Leu
595 600 605

Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Leu Asn Phe
610 615 620

Ser Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Leu
625 630 635 640

Asn Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asn Leu Glu Asp
645 650 655

Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu Trp
660 665 670

Gln Ile Leu Pro Cys Ala Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly
675 680 685

Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly
690 695 700

Val Gly Ser Ala Phe Val Ser Phe Ala Ile Lys Trp Glu Tyr Ile Leu
705 710 715 720

Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys Ala Cys Leu Trp
725 730 735

Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val
740 745 750



Val Leu Asn Ala Ala Ser Val Ala Gly Ala His Gly Ile Leu Ser Phe
755 760 765

Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys Gly Arg Leu Ala Pro
770 775 780

Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu
785 790 795 800

Leu Ala Leu Pro Pro Arg Ala Tyr Ala Leu Asp Arg Glu Met Ala Ala
805 810 815

Ser Cys Gly Gly Ala Val Leu Val Gly Leu Val Phe Leu Thr Leu Ser
820 825 830

Pro Tyr Tyr Lys Val Phe Leu Thr Arg Leu Ile Trp Trp Leu Gln Tyr
835 840 845

Phe Ile Thr Arg Ala Glu Ala His Met Gln Val Trp Val Pro Pro Leu
850 855 860

Asn Val Arg Gly Gly Arg Asp Ala Ile Ile Leu Leu Thr Cys Ala Val
865 870 875 880

His Pro Glu Leu Ile Phe Asp Ile Thr Lys Leu Leu Leu Ala Ile Leu
885 890 895

Gly Pro Leu Met Val Leu Gln Ala Gly Ile Thr Arg Val Pro Tyr Phe
900 905 910

Val Arg Ala Gln Gly Leu Ile Arg Ala Cys Met Leu Val Arg Lys Val
915 920 925

Ala Gly Gly His Tyr Val Gln Met Val Phe Met Lys Leu Gly Ala Leu
930 935 940

Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala
945 950 955 960

His Ala Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe
965 970 975

Ser Ala Met Glu Thr Lys Val Ile Thr Trp Gly Ala Asp Thr Ala Ala
980 985 990

Cys Gly Asp Ile Ile Leu Gly Leu Pro Val Ser Ala Arg Arg Gly Lys
995 1000 1005



Glu Ile Phe Leu Gly Pro Ala Asp Ser Leu Glu Gly Gln Gly Trp Arg
1010 1015 1020

Leu Leu Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Val Leu
1025 1030 1035 1040

Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
1045 1050 1055

Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr
1060 1065 1070

Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys
1075 1080 1085

Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val
1090 1095 1100

Asp Leu Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Met
1105 1110 1115 1120 

Thr Pro Cys Ser Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His 
1125 1130 1135 

Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
1140 1145 1150 

Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro 
1155 1160 1165 

Leu Leu Cys Pro Ser Gly His Val Val Gly Val Phe Arg Ala Ala Val 
1170 1175 1180 

Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Ser
1185 1190 1195 1200 

Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Thr Pro 
1205 1210 1215 

Pro Ala Val Pro Gln Thr Phe Gln Val Ala His Leu His Ala Pro Thr
1220 1225 1230

Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly
1235 1240 1245

Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe
1250 1255 1260



Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr
1265 1270 1275 1280

Gly Val Arg Thr Ile Thr Thr Gly Gly Ser Ile Thr Tyr Ser Thr Tyr
1285 1290 1295

Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile
1300 1305 1310

Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly
1315 1320 1325

Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val
1330 1335 1340

Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro
1345 1350 1355 1360

Asn Ile Glu Glu Ile Gly Leu Ser Asn Asn Gly Glu Ile Pro Phe Tyr
1365 1370 1375

Gly Lys Ala Ile Pro Ile Glu Ala Ile Lys Gly Gly Arg His Leu Ile
1380 1385 1390 

Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Thr 
1395 1400 1405 

Gly Leu Gly Leu Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser 
1410 1415 1420 

Val Ile Pro Pro Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu
1425 1430 1435 1440

Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr
1445 1450 1455

Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile
1460 1465 1470

Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg
1475 1480 1485

Gly Arg Thr Gly Arg Gly Arg Ser Gly Ile Tyr Arg Phe Val Thr Pro
1490 1495 1500

Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys
1505 1510 1515 1520



Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser 
1525 1530 1535 

Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln
1540 1545 1550

Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile
1555 1560 1565

Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro
1570 1575 1580

Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro
1585 1590 1595 1600

Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro
1605 1610 1615

Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln
1620 1625 1630

Asn Glu Val Ile Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys
1635 1640 1645 

Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly 
1650 1655 1660 

Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val 
1665 1670 1675 1680 

Val Ile Val Gly Arg Ile Ile Leu Ser Gly Lys Pro Ala Val Val Pro
1685 1690 1695

Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala
1700 1705 1710

Ser Gln Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe
1715 1720 1725

Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu
1730 1735 1740

Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe
1745 1750 1755 1760

Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala
1765 1770 1775



Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala
1780 1785 1790

Phe Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Asn Thr Leu Leu
1795 1800 1805

Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser
1810 1815 1820

Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly
1825 1830 1835 1840

Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly
1845 1850 1855 

Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu 
1860 1865 1870 

Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser
1875 1880 1885

Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg
1890 1895 1900

His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile
1905 1910 1915 1920 

Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro 
1925 1930 1935 

Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr
1940 1945 1950

Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys
1955 1960 1965

Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile
1970 1975 1980

Cys Thr Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu
1985 1990 1995 2000

Pro Arg Leu Pro Gly Val Pro Phe Leu Ser Cys Gln Arg Gly Tyr Lys
2005 2010 2015

Gly Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly
2020 2025 2030



Ala Gln Ile Ala Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly
2035 2040 2045

Pro Arg Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala
2050 2055 2060 

Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg 
2065 2070 2075 2080 

Ala Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val 
2085 2090 2095 

Gly Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys 
2100 2105 2110 

Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val
2115 2120 2125 

Arg Leu His Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg Glu Asp 
2130 2135 2140 

Val Thr Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu
2145 2150 2155 2160 

Pro Cys Glu Pro Glu Pro Asp Val Thr Val Leu Thr Ser Met Leu Thr 
2165 2170 2175 

Asp Pro Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg
2180 2185 2190

Gly Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala
2195 2200 2205 

Pro Ser Leu Lys Ala Thr Cys Thr Thr His His Asp Ser Pro Asp Ala 
2210 2215 2220 

Asp Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn
2225 2230 2235 2240

Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe
2245 2250 2255

Glu Pro Leu His Ala Glu Gly Asp Glu Arg Glu Ile Ser Val Ala Ala
2260 2265 2270

Glu Ile Leu Arg Lys Ser Arg Lys Phe Pro Ser Ala Leu Pro Ile Trp
2275 2280 2285

Ala Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro
2290 2295 2300 

Asp Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Thr Lys 
2305 2310 2315 2320 

Ala Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr
2325 2330 2335 

Glu Ser Asn Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe 
2340 2345 2350 

Gly Ser Ser Gly Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu 
2355 2360 2365 

Pro Asp Leu Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser 
2370 2375 2380 

Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu 
2385 2390 2395 2400 

Ser Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val 
2405 2410 2415 

Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro
2420 2425 2430

Cys Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser
2435 2440 2445 

Leu Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala 
2450 2455 2460 

Ser Leu Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp
2465 2470 2475 2480 

Asp His Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr 
2485 2490 2495 

Val Lys Ala Lys Leu Leu Ser Ile Glu Glu Ala Cys Lys Leu Thr Pro
2500 2505 2510 

Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg 
2515 2520 2525 

Asn Leu Ser Ser Arg Ala Val Asn His Ile Arg Ser Val Trp Glu Asp
2530 2535 2540



Leu Leu Glu Asp Thr Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys
2545 2550 2555 2560

Ser Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala
2565 2570 2575

Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met
2580 2585 2590

Ala Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser
2595 2600 2605

Ser Tyr Gly Phe Gln Tyr Ser Pro Lys Gln Arg Val Glu Phe Leu Val
2610 2615 2620 

Asn Thr Trp Lys Ser Lys Lys Cys Pro Met Gly Phe Ser Tyr Asp Thr 
2625 2630 2635 2640 

Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Val Glu Glu
2645 2650 2655

Ser Ile Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile
2660 2665 2670

Arg Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser
2675 2680 2685

Lys Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu
2690 2695 2700 

Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Thr Ala 
2705 2710 2715 2720 

Ala Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Gly
2725 2730 2735

Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala
2740 2745 2750 

Ala Ala Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro 
2755 2760 2765 

Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser
2770 2775 2780

Cys Ser Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val
2785 2790 2795 2800



Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp 
2805 2810 2815 

Glu Thr Ala Arg His Thr Pro Ile Asn Ser Trp Leu Gly Asn Ile Ile
2820 2825 2830

Met Tyr Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe
2835 2840 2845

Phe Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys
2850 2855 2860

Gln Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln
2865 2870 2875 2880

Ile Ile Glu Arg Leu His Gly Leu Ser Ala Phe Thr Leu His Ser Tyr
2885 2890 2895

Ser Pro Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly
2900 2905 2910 

Val Pro Pro Leu Arg Thr Trp Arg His Arg Ala Arg Ser Val Arg Ala 
2915 2920 2925 

Lys Leu Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly Arg Tyr Leu
2930 2935 2940

Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala
2945 2950 2955 2960

Ala Ser Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly
2965 2970 2975

Gly Asp Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Pro
2980 2985 2990

Leu Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro
2995 3000 3005 



Asn Arg 
3010 



<210> 4 
<211> 9595 
<212> DNA
<213> Hepatitis C virus



<400> 4 

gccagccccc tgatgggggc gacactccac 
tcttcacgca gaaagcgtct agccatggcg 
cccccctccc gggagagcca tagtggtctg 
gacgaccggg tcctttcttg gatcaacccg 
gcgagactgc tagccgagta gtgttgggtc 
gtgcttgcga gtgccccggg aggtctcgta 
ctcaaagaaa aaccaaacgt aacaccaacc 
gtggtcagat cgttggtgga gtttacctgt 
gcgcgactag gaaggcttcc gagcggtcgc 
aggctcgccg acccgagggc agggcctggg 
gcaatgaggg cctggggtgg gcaggatggc 
ggggccccac ggacccccgg cgtaggtcgc 
catgcggctt cgccgatctc atggggtaca 
ctgccagggc cttggcacac ggtgtccggg 
ggaacttgcc cggttgctct ttctctatct 
tcccagcttc cgcttatgaa gtgcgcaacg 
gctccaactc aagcattgtg tatgaggcag 
tgccctgtgt tcaggagggt aacagctccc 
cggccaggaa tgccagcgtc cccactacga 
ggacggctgc tttctgctcc gctatgtacg 
tctcccagct gttcaccttc tcgcctcgcc 
caatctatcc cggccatgta tcaggtcacc 
cacctacaac agccctagtg gtgtcgcagt 
tggtggcggg ggcccactgg ggagtcctgg 
actgggctaa ggttctgatt gtggcgctac 
cgacggggag ggtggccggc cacaccacct 
cgtctcagaa aatccagctt gtgaatacca 
taaattgcaa tgactccctc caaactgggt 
tcaactcgtc cgggtgcccg gagcgcatgg 
a3g33tgggg ccccatcacc tatactaagc
ggcattacgc gcctcgaccg tgtggtgtcg 
attgtttcac cccaagccct gttgtggtgg 
atagctgggg ggagaatgag acagacgtga 
gcaactggtt cggctgtaca tggatgaata 
ccccgtgtaa catcgggggg gtcggtaacc 
ggaagcaccc cgaggctact tacacaaaat 
gcctagtaga ctacccatac aggctttggc 
ttaaggttag gatgtatgtg gggggcgtgg 
ctcgaggaga gcgctgtaac ttggaggaca 
tgtctacaac agagtggcag atactgccct 
ctggtttgat ccatctccat cagaacatcg 
cagcgtttgt ctcctttgca atcaaatggg 
cagacgcgcg cgtgtgtgcc tgcttgtgga 
ccttagagaa cttggtggtc ctcaatgcgg 
cctttcttgt gttcttctgc gccgcctggt 



catgaatcac tcccctgtga ggaactactg 60 
ttagtatgag tgtcgtgcag cctccaggac 120 
cggaaccggt gagtacaccg gaattgccag 180 
ctcaatgcct ggagatttgg gcgtgccccc 240 
gcgaaaggcc ttgtggtact gcctgatagg 300 
gaccgtgcac catgagcacg aatcctaaac 360 
gccgcccaca ggacgtcaag ttcccgggcg 420 
tgccgcgcag gggccccagg ttgggtgtgc 480 
aacctcgtgg aaggcgacaa cctatcccaa 540 
ctcagcccgg gtacccttgg cccctctatg 600 
tcctgtcacc ccgcggctcc cggcctagtt 650 
gtaacttggg taaggtcatc gataccctta 720 
ttccgctcgt cggcgccccc ctagggggcg 780 
ttctggagga cggcgtgaac tatgcaacag 84 0 
tcctcttggc tctgctgtcc tgtttgacca 900 
tgtccgggat ataccatgtc acgaacgact 960 
cggacgtgat catgcatact cccgggtgcg 102 0 
gttgctgggt agcgctcact cccacgctcg 1080 
caatacgacg ccacgtcgac ttgctcgttg 114 0 
tgggggatct ctgcggatct attttcctcg 12 0 0 
ggcatgagac agtgcaggac tgcaactgct 12 6 0 
gcatggcttg ggatatgatg atgaactggt 132 0 
tgctccggat cccacaagct gtcgtggaca 1380 
cgggccttgc ctactattcc atggtaggga 144 0 
tctttgccgg cgttgacggg gagacccaca 1500 
ccgggttcac gtcccttttc tcatctgggg 1560 
acggcagctg gcacatcaac aggactgccc 162 0 
tctttgccgc gctgttttac gcacacaagt 1680 
ccagctgccg ccccattgac tggttcgccc 1740 
ctaacagctc ggatcagagg ccttattgct 18 0 0 
tacccgcgtc gcaggtgtgt ggtccagtgt 1860 
ggaccaccga tcgttccggt gtccctacgt 192 0 
tgctcctcaa caacacgcgt ccgccacaag 1980 
gtactgggtt cactaagacg tgcggaggtc 204 0 
gcaccttgat ctgccccacg gactgcttcc 2100 
gtggctcggg gccctggttg acacctaggt 2160 
actacccctg cactctcaat ttttccatct 2220 
agcacaggct caatgccgca tgcaattgga 22 8 0 
gggataggtc agaactcagc ccgctgctgc 234 0 
gtgctttcac caccctaccg gctttatcca 2400 
tggacgtgca atacctgtac ggtgtagggt 2460 
agtacatcct gttgcttttc cttctcctgg 252 0 
tgatgctgct gatagcccag gctgaggccg 25 80 
cgtccgtggc cggagcgcat ggtattctct 264 0 
acattaaggg caggctggct cctggggcgg 27 0 0 
cgtatgcttt ttatggcgta tggccgctgc tcctgctcct actggcgtta ccaccacgag 2760
cttacgcctt ggaccgggag atggctgcat cgtgcggggg tgcggttctt gtaggtctgg 2820
tattcttgac cttgtcacca tactacaaag tgtttctcac taggctcata tggtggttac 2880
aatactttat caccagagcc gaggcgcaca tgcaagtgtg ggtccccccc ctcaacgttc 2940
ggggaggccg cgatgccatc atcctcctca cgtgtgcggt tcatccagag ttaatttttg 3000
acatcaccaa actcctgctc gccatactcg gcccgctcat ggtgctccag gctggcataa 3060
cgagagtgcc gtacttcgtg cgcgctcaag ggctcattcg tgcatgcatg ttagtgcgaa 3120
aagtcgccgg gggtcattat gtccaaatgg tcttcatgaa gctgggcgcg ctgacaggta 3180
cgtacgttta taaccatctt accccactgc gggactgggc ccacgcgggc ctacgagacc 3240
ttgcggtggc ggtagagccc gtcgtcttct ccgccatgga gaccaaggtc atcacctggg 3300
gagcagacac cgctgcgtgt ggggacatca tcttgggtct acccgtctcc gcccgaaggg 3360
ggaaggagat atttttggga ccggctgata gtctcgaagg gcaagggtgg cgactccttg 3420
cgcccatcac ggcctactcc caacaaacgc ggggcgtact tggttgcatc atcactagcc 3480
tcacaggccg ggacaagaac caggtcgaag gggaggttca agtggtttct accgcaacac 3540
aatctttcct ggcgacctgc atcaacggcg tgtgctggac tgtctaccat ggcgctggct 3600
cgaagaccct agccggtcca aaaggtccaa tcacccaaat gtacaccaat gtagacctgg 3660
acctcgtcgg ctggcaggcg ccccccgggg cgcgctccat gacaccatgc agctgtggca 3720
gctcggacct ttacttggtc acgagacatg ctgatgtcat tccggtgcgc cggcgaggcg 3780
acagcagggg aagtctactc tcccccaggc ccgtctccta cctgaaaggc tcctcgggtg 3840
gtccattgct ttgcccttcg gggcacgtcg tgggcgtctt ccgggctgct gtgtgcaccc 3900
ggggggtcgc gaaggcggtg gacttcatac ccgttgagtc tatggaaact accatgcggt 3960
ctccggtctt cacagacaac tcaacccccc cggctgtacc gcagacattc caagtggcac 4020
atctgcacgc tcctactggc agcggcaaga gcaccaaagt gccggctgcg tatgcagccc 4080
aagggtacaa ggtgctcgtc ctgaacccgt ccgttgccgc caccttaggg tttggggcgt 4140
atatgtccaa ggcacacggt atcgacccta acatcagaac tggggtaagg accattacca 4200
cgggcggctc cattacgtac tccacctatg gcaagttcct tgccgacggt ggctgttctg 4260
ggggcgccta tgacatcata atatgtgatg agtgccactc aactgactcg actaccatct 4320
tgggcatcgg cacagtcctg gaccaagcgg agacggctgg agcgcggctc gtcgtgctcg 4380
ccaccgctac acctccggga tcggttaccg tgccacaccc caatatcgag gaaataggcc 4440
tgtccaacaa tggagagatc cccttctatg gcaaagccat ccccattgag gccatcaagg 4500
gggggaggca tctcattttc tgccattcca agaagaaatg tgacgagctc gccgcaaagc 4560
tgacaggcct cggactgaac gctgtagcat attaccgggg ccttgatgtg tccgtcatac 4620
cgcctatcgg agacgtcgtt gtcgtggcaa cagacgctct aatgacgggt ttcaccggcg 4680
attttgactc agtgatcgac tgcaatacat gtgtcaccca gacagtcgac ttcagcttgg 4740
atcccacctt caccattgag acgacgaccg tgccccaaga cgcggtgtcg cgctcgcaac 4800
ggcgaggtag aactggcagg ggtaggagtg gcatctacag gtttgtgact ccaggagaac 4860
ggccctcggg catgttcgat tcttcggtcc tgtgtgagtg ctatgacgcg ggctgtgctt 4920
ggtatgagct cacgcccgct gagacctcgg ttaggttgcg ggcttaccta aatacaccag 4980
ggttgcccgt ctgccaggac catctggagt tctgggagag cgtcttcaca ggcctcaccc 5040
acatagatgc ccacttcctg tcccagacta aacaggcagg agacaacttt ccttacctgg 5100
tggcatatca agctacagtg tgcgccaggg ctcaagctcc acctccatcg tgggaccaaa 5160
tgtggaagtg tctcatacgg ctgaaaccta cactgcacgg gccaacaccc ctgctgtata 5220
ggctaggagc cgtccaaaat gaggtcatcc tcacacaccc cataactaaa tacatcatgg 5280
catgcatgtc ggctgacctg gaggtcgtca ctagcacctg ggtgctggta ggcggagtcc 5340
ttgcagcttt ggccgcatac tgcctgacga caggcagtgt ggtcattgtg ggcaggatca 5400
tcttgtccgg gaagccagct gtcgttcccg acagggaagt cctctaccag gagttcgatg 5460
agatggaaga gtgtgcctca caacttcctt acatcgagca gggaatgcag ctcgccgagc 5520
aattcaagca aaaggcgctc gggttgttgc aaacggccac caagcaagcg gaggctgctg 5580
ctcccgtggt ggagtccaag tggcgagccc ttgagacctt ctgggcgaag cacatgtgga 5640
atttcatcag cggaatacag tacctagcag gcttatccac tctgcctgga aaccccgcga 5700 
tagcatcatt gatggcattt acagcttcta tcactagccc gctcaccacc caaaacaccc 5760 
tcctgtttaa catcttgggg ggatgggtgg ctgcccaact cgctcctccc agcgctgcgt 5820 
cagctttcgt gggcgccggc atcgccggag cggctgttgg cagcataggc cttgggaagg 5880 
tgctcgtgga catcttggcg ggctatgggg caggggtagc cggcgcactc gtggccttta 5940
aggtcatgag cggcgaggtg ccctccaccg aggacctggt caacttactc cctgccatcc 6000 
tctctcctgg tgccctggtc gtcggggtcg tgtgcgcagc aatactgcgt cggcacgtgg 6060 
gcccgggaga gggggctgtg cagtggatga accggctgat agcgttcgct tcgcggggta 6120
accacgtctc ccctacgcac tatgtgcctg agagcgacgc tgcagcacgt gtcactcaga 6180
tcctctctag ccttaccatc actcaactgc tgaagcggct ccaccagtgg attaatgagg 6240
actgctctac gccatgctcc ggctcgtggc taagggatgt ttgggattgg atatgcacgg 6300
tgttgactga cttcaagacc tggctccagt ccaaactcct gccgcggtta ccgggagtcc 6360
ctttcctgtc atgccaacgc gggtacaagg gagtctggcg gggggacggc atcatgcaaa 6420
ccacctgccc atgcggagca cagatcgccg gacatgtcaa aaacggttcc atgaggatcg 6480 
tagggcctag aacctgcagc aacacgtggc acggaacgtt ccccatcaac gcatacacca 6540 
cgggaccttg cacaccctcc ccggcgccca actattccag ggcgctatgg cgggtggctg 6600 
ctgaggagta cgtggaggtt acgcgtgtgg gggatttcca ctacgtgacg ggcatgacca 6660 
ctgacaacgt aaagtgccca tgccaggttc cggcccccga attcttcacg gaggtggatg 6720 
gagtgcggtt gcacaggtac gctccggcgt gcaaacctct tctacgggag gacgtcacgt 6780
tccaggtcgg gctcaaccaa tacttggtcg ggtcgcagct cccatgcgag cccgaaccgg 6840 
acgtaacagt gcttacttcc atgctcaccg atccctccca cattacagca gagacggcta 6900 
agcgtaggct ggctagaggg tctcccccct ctttagccag ctcatcagct agccagttgt 6960 
ctgcgccttc tttgaaggcg acatgcacta cccaccatga ctccccggac gctgacctca 7020 
tcgaggccaa cctcttgtgg cggcaggaga tgggcggaaa catcactcgc gtggagtcag 7080 
agaataaggt agtaattctg gactctttcg aaccgcttca cgcggagggg gatgagaggg 7140
agatatccgt cgcggcggag atcctgcgaa aatccaggaa gttcccctca gcgttgccca 7200
tatgggcacg cccggactac aatcctccac tgctagagtc ctggaaggac ccggactacg 7260
tccctccggt ggtacacgga tgcccattgc cacctaccaa ggctcctcca ataccacctc 7320
cacggagaaa gaggacggtt gtcctgacag aatccaatgt gtcttctgcc ttggcggagc 7380 
tcgccactaa gaccttcggt agctccggat cgtcggccgt tgatagcggc acggcgaccg 7440
cccttcctga cctggcctcc gacgacggtg acaaaggatc cgacgttgag tcgtactcct 7500 
ccatgccccc ccttgaaggg gagccggggg accccgatct cagcgacggg tcttggtcta 7560 
ccgtgagtga ggaggctagt gaggatgtcg tctgctgctc aatgtcctat acgtggacag 7620
gcgccctgat cacgccatgc gctgcggagg aaagtaagct gcccatcaac ccgttgagca 7680
actctttgct gcgtcaccac aacatggtct acgccacaac atcccgcagc gcaagcctcc 7740 
ggcagaagaa ggtcaccttt gacagattgc aagtcctgga tgatcattac cgggacgtac 7800 
tcaaggagat gaaggcgaag gcgtccacag ttaaggctaa gcttctatct atagaggagg 7860 
cctgcaagct gacgccccca cattcggcca aatccaaatt tggctatggg gcaaaggacg 7920
tccggaacct atccagcagg gccgttaacc acatccgctc cgtgtgggag gacttgctgg 7980
aagacactga aacaccaatt gacaccacca tcatggcaaa aagtgaggtt ttctgcgtcc 8040 
aaccagagaa gggaggccgc aagccagctc gccttatcgt attcccagac ctgggagttc 8100 
gtgtatgcga gaagatggcc ctttacgacg tggtctccac ccttcctcag gccgtgatgg 8160 
gctcctcata cggatttcaa tactccccca agcagcgggt cgagttcctg gtgaatacct 8220
ggaaatcaaa gaaatgccct atgggcttct catatgacac ccgctgtttt gactcaacgg 8280 
tcactgagag tgacattcgt gttgaggagt caatttacca atgttgtgac ttggcccccg 8340 
aggccagaca ggccataagg tcgctcacag agcggcttta catcgggggt cccctgacta 8400 
actcaaaagg gcagaactgc ggttatcgcc ggtgccgcgc aagtggcgtg ctgacgacta 8460 
gctgcggtaa taccctcaca tgttacttga aggccactgc agcctgtcga gctgcaaagc 8520
tccaggactg cacgatgctc gtgaacggag acgaccttgt cgttatctgt gaaagcgcgg 8580
gaacccagga ggatgcggcg gccctacgag ccttcacgga ggctatgact aggtattccg 8640
ccccccccgg ggatccgccc caaccagaat acgacctgga gctgataaca tcatgttcct 8700
ccaatgtgtc agtcgcgcac gatgcatctg gcaaaagggt atactacctc acccgtgacc 8760
ccaccacccc ccttgcacgg gctgcgtggg agacagctag acacactcca atcaactctt 8820
ggctaggcaa tatcatcatg tatgcgccca ccctatgggc aaggatgatt ctgatgactc 8880
actttttctc catccttcta gctcaagagc aacttgaaaa agccctggat tgtcagatct 8940
acggggcttg ctactccatt gagccacttg acctacctca gatcattgaa cgactccatg 9000
gtcttagcgc atttacactc cacagttact ctccaggtga gatcaatagg gtggcttcat 9060
gcctcaggaa acttggggta ccacccttgc gaacctggag acatcgggcc agaagtgtcc 9120
gcgctaagct actgtcccag ggggggaggg ccgccacttg tggcagatac ctctttaact 9180
gggcagtaag gaccaagctt aaactcactc caatcccggc cgcgtcccag ctggacttgt 9240
ctggctggtt cgtcgctggt tacagcgggg gagacatata tcacagcctg tctcgtgccc 9300
gaccccgctg gtttccgttg tgcctactcc tactttctgt aggggtaggc atttacctgc 9360
tccccaaccg atgaacgggg agctaaccac tccaggcctt aagccatttc ctgttttttt 9420
tttttttttt tttttttttt tctttttttt tttctttcct ttccttcttt ttttcctttc 9480
tttttccctt ctttaatggt ggctccatct tagccctagt cacggctagc tgtgaaaggt 9540
ccgtgagccg catgactgca gagagtgctg atactggcct ctctgcagat catgt 9595



<210> 5 
<211> 3011 
<212> PRT 

<213> Hepatitis C virus 



<400> 5 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Ala Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110 






Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140

Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Ile Pro Ala Ser Ala Tyr
180 185 190

Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp Cys Ser
195 200 205

Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Val Ile Met His Thr Pro
210 215 220

Gly Cys Val Pro Cys Val Gln Glu Gly Asn Ser Ser Arg Cys Trp Val
225 230 235 240

Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr
245 250 255

Thr Ile Arg Arg His Val Asp Leu Leu Val Gly Thr Ala Ala Phe Cys
260 265 270

Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Ser Ile Phe Leu Val Ser
275 280 285

Gln Leu Phe Thr Phe Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly His Val Ser Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gln
325 330 335

Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His
340 345 350 

Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp 
355 360 365 
Ala Lys Val Leu Ile Val Ala Leu Leu Phe Ala Gly Val Asp Gly Glu
370 375 380

Thr His Thr Thr Gly Arg Val Ala Gly His Thr Thr Ser Gly Phe Thr
385 390 395 400

Ser Leu Phe Ser Ser Gly Ala Ser Gln Lys Ile Gln Leu Val Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser
420 425 430

Leu Gln Thr Gly Phe Phe Ala Ala Leu Phe Tyr Ala His Lys Phe Asn
435 440 445

Ser Ser Gly Cys Pro Glu Arg Met Ala Ser Cys Arg Pro Ile Asp Trp
450 455 460

Phe Ala Gln Gly Trp Gly Pro Ile Thr Tyr Thr Lys Pro Asn Ser Ser
465 470 475 480

Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly Val
485 490 495

Val Pro Ala Ser Gln Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser
500 505 510

Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Val Pro Thr Tyr Ser
515 520 525

Trp Gly Glu Asn Glu Thr Asp Val Met Leu Leu Asn Asn Thr Arg Pro
530 535 540

Pro Gln Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe
545 550 555 560

Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Val Gly Asn
565 570 575

Arg Thr Leu Ile Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala
580 585 590

Thr Tyr Thr Lys Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Leu
595 600 605

Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Leu Asn Phe
610 615 620

Ser Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Leu
625 630 635 640

Asn Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asn Leu Glu Asp
645 650 655

Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Glu Trp
660 665 670

Gln Ile Leu Pro Cys Ala Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly
675 680 685

Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly
690 695 700

Val Gly Ser Ala Phe Val Ser Phe Ala Ile Lys Trp Glu Tyr Ile Leu
705 710 715 720

Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys Ala Cys Leu Trp
725 730 735

Met Met Leu Leu Ile Ala Gln Ala Glu Ala Ala Leu Glu Asn Leu Val
740 745 750

Val Leu Asn Ala Ala Ser Val Ala Gly Ala His Gly Ile Leu Ser Phe
755 760 765

Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys Gly Arg Leu Ala Pro
770 775 780

Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu
785 790 795 800

Leu Ala Leu Pro Pro Arg Ala Tyr Ala Leu Asp Thr Glu Val Ala Ala
805 810 815

Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala Leu Thr Leu Ser
820 825 830

Pro Tyr Tyr Lys Arg Tyr Ile Ser Trp Cys Met Trp Trp Leu Gln Tyr
835 840 845

Phe Leu Thr Arg Val Glu Ala Gln Leu His Val Trp Val Pro Pro Leu
850 855 860

Asn Val Arg Gly Gly Arg Asp Ala Val Ile Leu Leu Met Cys Val Val
865 870 875 880

His Pro Thr Leu Val Phe Asp Ile Thr Lys Leu Leu Leu Ala Ile Phe
885 890 895

Gly Pro Leu Trp Ile Leu Gln Ala Ser Leu Leu Lys Val Pro Tyr Phe
900 905 910

Val Arg Val Gln Gly Leu Leu Arg Ile Cys Ala Leu Ala Arg Lys Ile
915 920 925

Ala Gly Gly His Tyr Val Gln Met Ala Ile Ile Lys Leu Gly Ala Leu
930 935 940

Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala
945 950 955 960

His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe
965 970 975

Ser Arg Met Glu Thr Lys Leu Ile Thr Trp Gly Ala Asp Thr Ala Ala
980 985 990

Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala Arg Arg Gly Gln
995 1000 1005

Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser Lys Gly Trp Arg
1010 1015 1020

Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu
1025 1030 1035 1040

Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
1045 1050 1055

Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr
1060 1065 1070

Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg
1075 1080 1085

Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val
1090 1095 1100

Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg Ser Leu
1105 1110 1115 1120 

Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His 
1125 1130 1135 
Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
1140 1145 1150

Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro
1155 1160 1165

Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe Arg Ala Ala Val
1170 1175 1180

Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Asn
1185 1190 1195 1200

Leu Gly Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro
1205 1210 1215

Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr
1220 1225 1230

Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly
1235 1240 1245

Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe
1250 1255 1260

Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg Thr
1265 1270 1275 1280

Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr
1285 1290 1295

Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile
1300 1305 1310

Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly
1315 1320 1325

Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val
1330 1335 1340

Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Ser His Pro
1345 1350 1355 1360

Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr
1365 1370 1375

Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile
1380 1385 1390 
Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val
1395 1400 1405 



Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser
1410 1415 1420

Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ser Thr Asp Ala Leu
1425 1430 1435 1440

Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr
1445 1450 1455

Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile
1460 1465 1470

Glu Thr Thr Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg
1475 1480 1485

Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro
1490 1495 1500

Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys
1505 1510 1515 1520

Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr
1525 1530 1535

Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln
1540 1545 1550

Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile
1555 1560 1565

Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Phe Pro
1570 1575 1580

Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro
1585 1590 1595 1600

Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro
1605 1610 1615

Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln
1620 1625 1630

Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Thr Cys
1635 1640 1645 
Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly
1650 1655 1660 

Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val 
1665 1670 1675 1680 

Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro Ala Ile Ile Pro
1685 1690 1695

Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ser
1700 1705 1710

Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe
1715 1720 1725

Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg His Ala Glu
1730 1735 1740

Val Ile Thr Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Val Phe
1745 1750 1755 1760

Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala
1765 1770 1775

Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala
1780 1785 1790

Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly Gln Thr Leu Leu
1795 1800 1805

Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly
1810 1815 1820

Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly
1825 1830 1835 1840

Ser Val Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly
1845 1850 1855

Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu
1860 1865 1870

Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser
1875 1880 1885

Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg
1890 1895 1900 
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile
1905 1910 1915 1920

Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro
1925 1930 1935

Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr
1940 1945 1950

Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys
1955 1960 1965

Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile
1970 1975 1980

Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met
1985 1990 1995 2000

Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Arg
2005 2010 2015

Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly
2020 2025 2030

Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly
2035 2040 2045

Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala
2050 2055 2060

Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Lys Phe
2065 2070 2075 2080

Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Arg Val
2085 2090 2095

Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp Asn Leu Lys Cys
2100 2105 2110

Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val
2115 2120 2125

Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu
2130 2135 2140

Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu
2145 2150 2155 2160 
Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr
2165 2170 2175 



Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg
2180 2185 2190

Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala
2195 2200 2205

Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala
2210 2215 2220

Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn
2225 2230 2235 2240

Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe
2245 2250 2255

Asp Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala
2260 2265 2270

Glu Ile Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala Leu Pro Val Trp
2275 2280 2285 

Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro 
2290 2295 2300 

Asp Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Arg 
2305 2310 2315 2320 

Ser Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr 
2325 2330 2335 

Glu Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Lys Ser Phe 
2340 2345 2350 

Gly Ser Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser
2355 2360 2365 

Ser Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Val Glu Ser 
2370 2375 2380 

Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu 
2385 2390 2395 2400 

Ser Asp Gly Ser Trp Ser Thr Val Ser Ser Gly Ala Asp Thr Glu Asp 
2405 2410 2415 
Val Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr
2420 2425 2430 



Pro Cys Ala Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn
2435 2440 2445 

Ser Leu Leu Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser 
2450 2455 2460 

Ala Cys Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu
2465 2470 2475 2480

Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser
2485 2490 2495 

Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr 
2500 2505 2510 

Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val 
2515 2520 2525 

Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn Ser Val Trp Lys
2530 2535 2540

Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr Thr Ile Met Ala
2545 2550 2555 2560

Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro
2565 2570 2575

Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys
2580 2585 2590 

Met Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu Ala Val Met Gly 
2595 2600 2605 

Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu
2610 2615 2620

Val Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp
2625 2630 2635 2640

Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu
2645 2650 2655

Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala
2660 2665 2670 
Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn
2675 2680 2685 



Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val 
2690 2695 2700 

Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg
2705 2710 2715 2720

Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys
2725 2730 2735

Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp
2740 2745 2750 

Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala 
2755 2760 2765 

Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr
2770 2775 2780 

Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg 
2785 2790 2795 2800 

Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala 
2805 2810 2815 

Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile
2820 2825 2830

Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His
2835 2840 2845

Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asn
2850 2855 2860

Cys Glu Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro
2865 2870 2875 2880

Pro Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser
2885 2890 2895

Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu
2900 2905 2910 

Gly Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg 
2915 2920 2925 
Ala Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr
2930 2935 2940 



Leu Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala
2945 2950 2955 2960 

Ala Ala Gly Arg Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser 
2965 2970 2975 

Gly Gly Asp Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Phe
2980 2985 2990

Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu
2995 3000 3005 

Pro Asn Arg 
3010 



<210> 6 
<211> 9599 
<212> DNA 

<213> Hepatitis C virus 
<400> 6 

gccagccccc tgatgggggc gacactccac catgaatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360
ctcaaagaaa aaccaaacgt aacaccaacc gccgcccaca ggacgtcaag ttcccgggcg 420
gtggtcagat cgttggtgga gtttacctgt tgccgcgcag gggccccagg ttgggtgtgc 480
gcgcgactag gaaggcttcc gagcggtcgc aacctcgtgg aaggcgacaa cctatcccaa 540
aggctcgccg acccgagggc agggcctggg ctcagcccgg gtacccttgg cccctctatg 600
gcaatgaggg cctggggtgg gcaggatggc tcctgtcacc ccgcggctcc cggcctagtt 660
ggggccccac ggacccccgg cgtaggtcgc gtaacttggg taaggtcatc gataccctta 720
catgcggctt cgccgatctc atggggtaca ttccgctcgt cggcgccccc ctagggggcg 780
ctgccagggc cttggcacac ggtgtccggg ttctggagga cggcgtgaac tatgcaacag 840
ggaacttgcc cggttgctct ttctctatct tcctcttggc tctgctgtcc tgtttgacca 900
tcccagcttc cgcttatgaa gtgcgcaacg tgtccgggat ataccatgtc acgaacgact 960
gctccaactc aagcattgtg tatgaggcag cggacgtgat catgcatact cccgggtgcg 1020
tgccctgtgt tcaggagggt aacagctccc gttgctgggt agcgctcact cccacgctcg 1080
cggccaggaa tgccagcgtc cccactacga caatacgacg ccacgtcgac ttgctcgttg 1140
ggacggctgc tttctgctcc gctatgtacg tgggggatct ctgcggatct attttcctcg 1200
tctcccagct gttcaccttc tcgcctcgcc ggcatgagac agtgcaggac tgcaactgct 1260
caatctatcc cggccatgta tcaggtcacc gcatggcttg ggatatgatg atgaactggt 1320
cacctacaac agccctagtg gtgtcgcagt tgctccggat cccacaagct gtcgtggaca 1380



tggtggcggg ggcccactgg ggagtcctgg cgggccttgc ctactattcc atggtaggga 1440
actgggctaa ggttctgatt gtggcgctac tctttgccgg cgttgacggg gagacccaca 1500 
cgacggggag ggtggccggc cacaccacct ccgggttcac gtcccttttc tcatctgggg 1560 
cgtctcagaa aatccagctt gtgaatacca acggcagctg gcacatcaac aggactgccc 1620 
taaattgcaa tgactccctc caaactgggt tctttgccgc gctgttttac gcacacaagt 1680 
tcaactcgtc cgggtgcccg gagcgcatgg ccagctgccg ccccattgac tggttcgccc 1740 
aggggtgggg ccccatcacc tatactaagc ctaacagctc ggatcagagg ccttattgct 1800
ggcattacgc gcctcgaccg tgtggtgtcg tacccgcgtc gcaggtgtgt ggtccagtgt 1860 
attgtttcac cccaagccct gttgtggtgg ggaccaccga tcgttccggt gtccctacgt 1920
atagctgggg ggagaatgag acagacgtga tgctcctcaa caacacgcgt ccgccacaag 1980 
gcaactggtt cggctgtaca tggatgaata gtactgggtt cactaagacg tgcggaggtc 2040
ccccgtgtaa catcgggggg gtcggtaacc gcaccttgat ctgccccacg gactgcttcc 2100 
ggaagcaccc cgaggctact tacacaaaat gtggctcggg gccctggttg acacctaggt 2160 
gcctagtaga ctacccatac aggctttggc actacccctg cactctcaat ttttccatct 2220 
ttaaggttag gatgtatgtg gggggcgtgg agcacaggct caatgccgca tgcaattgga 2280 
ctcgaggaga gcgctgtaac ttggaggaca gggataggtc agaactcagc ccgctgctgc 2340 
tgtctacaac agagtggcag atactgccct gtgctttcac caccctaccg gctttatcca 2400 
ctggtttgat ccatctccat cagaacatcg tggacgtgca atacctgtac ggtgtagggt 2460 
cagcgtttgt ctcctttgca atcaaatggg agtacatcct gttgcttttc cttctcctgg 2520 
cagacgcgcg cgtgtgtgcc tgcttgtgga tgatgctgct gatagcccag gctgaggccg 2580 
ccttagagaa cttggtggtc ctcaatgcgg cgtccgtggc cggagcgcat ggtattctct 2640 
cctttcttgt gttcttctgc gccgcctggt acattaaggg caggctggct cctggggcgg 2700 
cgtatgcttt ttatggcgta tggccgctgc tcctgctcct actggcgtta ccaccacgag 2760 
catatgcact ggacacggag gtggccgcgt cgtgtggcgg cgttgttctt gtcgggttaa 2820
tggcgctgac tctgtcgcca tattacaagc gctatatcag ctggtgcatg tggtggcttc 2880
agtattttct gaccagagta gaagcgcaac tgcacgtgtg ggttcccccc ctcaacgtcc 2940 
ggggggggcg cgatgccgtc atcttactca tgtgtgtagt acacccgacc ctggtatttg 3000 
acatcaccaa actactcctg gccatcttcg gacccctttg gattcttcaa gccagtttgc 3060
ttaaagtccc ctacttcgtg cgcgttcaag gccttctccg gatctgcgcg ctagcgcgga 3120
agatagccgg aggtcattac gtgcaaatgg ccatcatcaa gttaggggcg cttactggca 3180
cctatgtgta taaccatctc acccctcttc gagactgggc gcacaacggc ctgcgagatc 3240
tggccgtggc tgtggaacca gtcgtcttct cccgaatgga gaccaagctc atcacgtggg 3300 
gggcagatac cgccgcgtgc ggtgacatca tcaacggctt gcccgtctct gcccgtaggg 3360 
gccaggagat actgcttggg ccagccgacg gaatggtctc caaggggtgg aggttgctgg 3420 
cgcccatcac ggcgtacgcc cagcagacga gaggcctcct agggtgtata atcaccagcc 3480 
tgactggccg ggacaaaaac caagtggagg gtgaggtcca gatcgtgtca actgctaccc 3540 
aaaccttcct ggcaacgtgc atcaatgggg tatgctggac tgtctaccac ggggccggaa 3600 
cgaggaccat cgcatcaccc aagggtcctg tcatccagat gtataccaat gtggaccaag 3660
accttgtggg ctggcccgct cctcaaggtt cccgctcatt gacaccctgt acctgcggct 3720 
cctcggacct ttacctggtc acgaggcacg ccgatgtcat tcccgtgcgc cggcgaggtg 3780 
atagcagggg tagcctgctt tcgccccggc ccatttccta cttgaaaggc tcctcggggg 3840
gtccgctgtt gtgccccgcg ggacacgccg tgggcctatt cagggccgcg gtgtgcaccc 3900 
gtggagtggc taaagcggtg gactttatcc ctgtggagaa cctagggaca accatgagat 3960
ccccggtgtt cacggacaac tcctctccac cagcagtgcc ccagagcttc caggtggccc 4020
acctgcatgc tcccaccggc agcggtaaga gcaccaaggt cccggctgcg tacgcagccc 4080 
agggctacaa ggtgttggtg ctcaacccct ctgttgctgc aacgctgggc tttggtgctt 4140
acatgtccaa ggcccatggg gttgatccta atatcaggac cggggtgaga acaattacca 4200 
ctggcagccc catcacgtac tccacctacg gcaagttcct tgccgacggc gggtgctcag 4260 
gaggtgctta tgacataata atttgtgacg agtgccactc cacggatgcc acatccatct 4320
tgggcatcgg cactgtcctt gaccaagcag agactgcggg ggcgagactg gttgtgctcg 4380
ccactgctac ccctccgggc tccgtcactg tgtcccatcc taacatcgag gaggttgctc 4440 
tgtccaccac cggagagatc cccttttacg gcaaggctat ccccctcgag gtgatcaagg 4500
ggggaagaca tctcatcttc tgccactcaa agaagaagtg cgacgagctc gccgcgaagc 4560 
tggtcgcatt gggcatcaat gccgtggcct actaccgcgg tcttgacgtg tctgtcatcc 4620 
cgaccagcgg cgatgttgtc gtcgtgtcga ccgatgctct catgactggc tttaccggcg 4680 
acttcgactc tgtgatagac tgcaacacgt gtgtcactca gacagtcgat ttcagccttg 4740 
accctacctt taccattgag acaaccacgc tcccccagga tgctgtctcc aggactcaac 4800 
gccggggcag gactggcagg gggaagccag gcatctatag atttgtggca ccgggggagc 4860
gcccctccgg catgttcgac tcgtccgtcc tctgtgagtg ctatgacgcg ggctgtgctt 4920 
ggtatgagct cacgcccgcc gagactacag ttaggctacg agcgtacatg aacaccccgg 4980 
ggcttcccgt gtgccaggac catcttgaat tttgggaggg cgtctttacg ggcctcactc 5040 
atatagatgc ccacttttta tcccagacaa agcagagtgg ggagaacttt ccttacctgg 5100 
tagcgtacca agccaccgtg tgcgctaggg ctcaagcccc tcccccatcg tgggaccaga 5160
tgtggaagtg tttgatccgc cttaaaccca ccctccatgg gccaacaccc ctgctataca 5220 
gactgggcgc tgttcagaat gaagtcaccc tgacgcaccc aatcaccaaa tacatcatga 5280 
catgcatgtc ggccgacctg gaggtcgtca cgagcacctg ggtgctcgtt ggcggcgtcc 5340 
tggctgctct ggccgcgtat tgcctgtcaa caggctgcgt ggtcatagtg ggcaggatcg 5400 
tcttgtccgg gaagccggca attatacctg acagggaggt tctctaccag gagttcgatg 5460 
agatggaaga gtgctctcag cacttaccgt acatcgagca agggatgatg ctcgctgagc 5520
agttcaagca gaaggccctc ggcctcctgc agaccgcgtc ccgccatgca gaggttatca 5580 
cccctgctgt ccagaccaac tggcagaaac tcgaggtctt ttgggcgaag cacatgtgga 5640
atttcatcag tgggatacaa tacttggcgg gcctgtcaac gctgcctggt aaccccgcca 5700 
ttgcttcatt gatggctttt acagctgccg tcaccagccc actaaccact ggccaaaccc 5760 
tcctcttcaa catattgggg gggtgggtgg ctgcccagct cgccgccccc ggtgccgcta 5820 
ctgcctttgt gggtgctggc ctagctggcg ccgccatcgg cagcgttgga ctggggaagg 5880
tcctcgtgga cattcttgca gggtatggcg cgggcgtggc gggagctctt gtagcattca 5940
agatcatgag cggtgaggtc ccctccacgg aggacctggt caatctgctg cccgccatcc 6000
tctcgcctgg agcccttgta gtcggtgtgg tctgcgcagc aatactgcgc cggcacgttg 6060 
gcccgggcga gggggcagtg caatggatga accggctaat agccttcgcc tcccggggga 6120
accatgtttc ccccacgcac tacgtgccgg agagcgatgc agccgcccgc gtcactgcca 6180 
tactcagcag cctcactgta acccagctcc tgaggcgact gcatcagtgg ataagctcgg 6240 
agtgtaccac tccatgctcc ggttcctggc taagggacat ctgggactgg atatgcgagg 6300
tgctgagcga ctttaagacc tggctgaaag ccaagctcat gccacaactg cctgggattc 6360
cctttgtgtc ctgccagcgc gggtataggg gggtctggcg aggagacggc attatgcaca 6420 
ctcgctgcca ctgtggagct gagatcactg gacatgtcaa aaacgggacg atgaggatcg 6480
tcggtcctag gacctgcagg aacatgtgga gtgggacgtt ccccattaac gcctacacca 6540
cgggcccctg tactcccctt cctgcgccga actataagtt cgcgctgtgg agggtgtctg 6600
cagaggaata cgtggagata aggcgggtgg gggacttcca ctacgtatcg ggtatgacta 6660 
ctgacaatct taaatgcccg tgccagatcc catcgcccga atttttcaca gaattggacg 6720 
gggtgcgcct acacaggttt gcgccccctt gcaagccctt gctgcgggag gaggtatcat 6780 
tcagagtagg actccacgag tacccggtgg ggtcgcaatt accttgcgag cccgaaccgg 6840 
acgtagccgt gttgacgtcc atgctcactg atccctccca tataacagca gaggcggccg 6900
ggagaaggtt ggcgagaggg tcaccccctt ctatggccag ctcctcggct agccagctgt 6960 
ccgctccatc tctcaaggca acttgcaccg ccaaccatga ctcccctgac gccgagctca 7020
tagaggctaa cctcctgtgg aggcaggaga tgggcggcaa catcaccagg gttgagtcag 7080 
agaacaaagt ggtgattctg gactccttcg atccgcttgt ggcagaggag gatgagcggg 7140
aggtctccgt acctgcagaa attctgcgga agtctcggag attcgcccgg gccctgcccg 7200
tctgggcgcg gccggactac aaccccccgc tagtagagac gtggaaaaag cctgactacg 7260 
aaccacctgt ggtccatggc tgcccgctac cacctccacg gtcccctcct gtgcctccgc 7320 
ctcggaaaaa gcgtacggtg gtcctcaccg aatcaaccct atctactgcc ttggccgagc 7380
ttgccaccaa aagttttggc agctcctcaa cttccggcat tacgggcgac aatacgacaa 7440 
catcctctga gcccgcccct tctggctgcc cccccgactc cgacgttgag tcctattctt 7500 
ccatgccccc cctggagggg gagcctgggg atccggatct cagcgacggg tcatggtcga 7560 
cggtcagtag tggggccgac acggaagatg tcgtgtgctg ctcaatgtct tattcctgga 7620 
caggcgcact cgtcaccccg tgcgctgcgg aagaacaaaa actgcccatc aacgcactga 7680 
gcaactcgtt gctacgccat cacaatctgg tgtattccac cacttcacgc agtgcttgcc 7740 
aaaggcagaa gaaagtcaca tttgacagac tgcaagttct ggacagccat taccaggacg 7800 
tgctcaagga ggtcaaagca gcggcgtcaa aagtgaaggc taacttgcta tccgtagagg 7860
aagcttgcag cctgacgccc ccacattcag ccaaatccaa gtttggctat ggggcaaaag 7920 
acgtccgttg ccatgccaga aaggccgtag cccacatcaa ctccgtgtgg aaagaccttc 7980 
tggaagacag tgtaacacca atagacacta ccatcatggc caagaacgag gttttctgcg 8040 
ttcagcctga gaaggggggt cgtaagccag ctcgtctcat cgtgttcccc gacctgggcg 8100
tgcgcgtgtg cgagaagatg gccctgtacg acgtggttag caagctcccc ctggccgtga 8160 
tgggaagctc ctacggattc caatactcac caggacagcg ggttgaattc ctcgtgcaag 8220
cgtggaagtc caagaagacc ccgatggggt tctcgtatga tacccgctgt tttgactcca 8280
cagtcactga gagcgacatc cgtacggagg aggcaattta ccaatgttgt gacctggacc 8340 
cccaagcccg cgtggccatc aagtccctca ctgagaggct ttatgttggg ggccctctta 8400
ccaattcaag gggggaaaac tgcggctacc gcaggtgccg cgcgagcggc gtactgacaa 8460
ctagctgtgg taacaccctc acttgctaca tcaaggcccg ggcagcctgt cgagccgcag 8520 
ggctccagga ctgcaccatg ctcgtgtgtg gcgacgactt agtcgttatc tgtgaaagtg 8580
cgggggtcca ggaggacgcg gcgagcctga gagccttcac ggaggctatg accaggtact 8640
ccgccccccc cggggacccc ccacaaccag aatacgactt ggagcttata acatcatgct 8700 
cctccaacgt gtcagtcgcc cacgacggcg ctggaaagag ggtctactac cttacccgtg 8760
accctacaac ccccctcgcg agagccgcgt gggagacagc aagacacact ccagtcaatt 8820 
cctggctagg caacataatc atgtttgccc ccacactgtg ggcgaggatg atactgatga 8880 
cccatttctt tagcgtcctc atagccaggg atcagcttga acaggctctt aactgtgaga 8940
tctacggagc ctgctactcc atagaaccac tggatctacc tccaatcatt caaagactcc 9000 
atggcctcag cgcattttca ctccacagtt actctccagg tgaaatcaat agggtggccg 9060
catgcctcag aaaacttggg gtcccgccct tgcgagcttg gagacaccgg gcccggagcg 9120
tccgcgctag gcttctgtcc agaggaggca gggctgctat atgtggcaag tacctcttca 9180 
actgggcagt aagaacaaag ctcaaactca ctccaatagc ggccgctggc cggctggact 9240 
tgtccggttg gttcacggct ggctacagcg ggggagacat ttatcacagc gtgtctcatg 9300
cccggccccg ctggttctgg ttttgcctac tcctgctcgc tgcaggggta ggcatctacc 9360
tcctccccaa ccgatgaagg ttggggtaaa cactccggcc tcttaagcca tttcctgttt 9420 
tttttttttt tttttttttt tttttctttt tttttttctt tcctttcctt ctttttttcc 9480 
tttctttttc ccttctttaa tggtggctcc atcttagccc tagtcacggc tagctgtgaa 9540 
aggtccgtga gccgcatgac tgcagagagt gctgatactg gcctctctgc agatcatgt 9599 



<210> 7 
<211> 31 
<212> DNA 

<213> Hepatitis C virus 
<400> 7

ggctacagcg gggggagaca tttatcacag c



<210> 8 
<211> 30 
<212> DNA 

<213> Hepatitis C virus 
<400> 8 

tcatgcggct cacggacctt tcacagctag 



<210> 9 
<211> 38 
<212> DNA 

<213> Hepatitis C virus 
<400> 9 

gtccaagctt atcacagcgt gtctcatgcc cggccccg 



<210> 10 
<211> 39 
<212> DNA 

<213> Hepatitis C virus 
<400> 10 

cgtctctaga ggacctttca cagctagccg tgactaggg 



<210> 11 
<211> 38 
<212> DNA 

<213> Hepatitis C virus 
<400> 11 

tgaaggttgg ggtaaacact ccggcctctt aggccatt 



<210> 12 
<211> 35 
<212> DNA 

<213> Hepatitis C virus 
<400> 12 

acatgatctg cagagaggcc agtatcagca ctctc 



<210> 13 
<211> 46 
<212> DNA 

<213> Hepatitis C virus 
<400> 13 

gtccaagctt acgcgtaaac actccggcct ccttaagcca ttcctg 



<210> 14 
<211> 47 
<212> DNA 

<213> Hepatitis C virus 
<400> 14 

cgtctctaga catgatctgc agagaggcca gtatcagcac tctctgc 



<210> 15 
<211> 67 
<212> DNA 

<213> Hepatitis C virus 
<400> 15 

ttttttttgc ggccgctaat acgactcact atagccagcc ccctgatggg ggcgacactc 60 
caccatg 



<210> 16 
<211> 30 
<212> DNA 

<213> Hepatitis C virus 
<400> 16 

actgtcttca cgcagaaagc gtctagccat 



<210> 17 
<211> 43 
<212> DNA 

<213> Hepatitis C virus 
<400> 17 

cgtctctaga caggaaatgg cttaagaggc cggagtgttt acc 



<210> 18 
<211> 26
<212> DNA 

<213> Hepatitis C virus 
<400> 18 

gcctattggc ctggagtggt tagctc 



<210> 19 
<211> 40 
<212> DNA 

<213> Hepatitis C virus 
<400> 19 

aggatggcct taaggcctgg agtggttagc tccccgttca 



<210> 20 
<211> 39 
<212> DNA 

<213> Hepatitis C virus 
<400> 20 

cgtcatcgat cctcagcggg catatgcact ggacacgga 



<210> 21 
<211> 32 
<212> DNA 

<213> Hepatitis C virus 
<400> 21 

catgcaccag ctgatatagc gcttgtaata tg 



<210> 22 
<211> 30 
<212> DNA 

<213> Hepatitis C virus 
<400> 22 

tccgtagagg aagcttgcag cctgacgccc 



<210> 23 
<211> 34 
<212> DNA 

<213> Hepatitis C virus 
<400> 23

gtacttgcca catatagcag ccctgcctcc tctg 



<210> 24 
<211> 34 
<212> DNA 

<213> Hepatitis C virus 
<400> 24 

cagaggaggc agggctgcta tatgtggcaa gtac 



<210> 25 
<211> 43 
<212> DNA 

<213> Hepatitis C virus 
<400> 25 

cgtctctaga caggaaatgg cttaagaggc cggagtgttt acc 



<210> 26 
<211> 36 
<212> DNA 

<213> Hepatitis C virus 
<400> 26 

tgcaattgga ctcgaggaga gcgctgtaac ttggag 



<210> 27 
<211> 35 
<212> DNA 

<213> Hepatitis C virus 
<400> 27 

cggtccaagg catatgctcg tggtggtaac gccag 



<210> 28 
<211> 35 
<212> PRT 

<213> Hepatitis C virus
<400> 28

Ala Gly Val Asp Gly Glu Thr His Thr Thr Gly Arg Val Ala Gly His
1 5 10 15

Thr Thr Ser Gly Phe Thr Ser Leu Phe Ser Ser Gly Ala Ser Gln Lys
20 25 30

Ile Gln Leu
35



<210> 29 
<211> 19 
<212> PRT 

<213> Hepatitis C virus 
<400> 29 

Gly Trp Gly Pro Ile Thr Tyr Thr Lys Pro Asn Ser Ser Asp Gln Arg



Pro Tyr Cys 



<210> 30 
<211> 35 
<212> PRT 

<213> Hepatitis C virus 
<400> 30 

Ala Gly Val Asp Gly Glu Thr His Thr Thr Gly Arg Val Ala Gly His 
1 5 10 15

Thr Thr Ser Arg Phe Thr Ser Leu Phe Ser Ser Gly Ala Ser Gln Lys
20 25 30

Ile Gln Leu
35 



<210> 31 
<211> 19 
<212> PRT 

<213> Hepatitis C virus 
<400> 31 

Gly Trp Gly Pro Ile Thr Tyr Thr Glu Pro Asn Ser Ser Asp Gln Arg
1 5 10 15

Pro Tyr Cys 




<210> 32 
<211> 35 
<212> PRT 

<213> Hepatitis C virus 
<400> 32 

Ala Gly Val Asp Gly Glu Thr His Thr Thr Gly Arg Val Val Gly His
1 5 10 15

Thr Thr Ser Gly Phe Thr Ser Leu Phe Ser Ser Gly Ala Ser Gln Lys
20 25 30

Ile Gln Leu
35



<210> 33 
<211> 19 
<212> PRT 

<213> Hepatitis C virus 
<400> 33 

Gly Trp Gly Pro Ile Thr Tyr Thr Gly Pro Asn Ser Ser Asp Gln Arg



<210> 34 
<211> 35 
<212> PRT 

<213> Hepatitis C virus 
<400> 34 

Ala Gly Val Asp Gly Glu Thr His Thr Thr Gly Arg Val Val Gly Arg 
1 5 10 15

Thr Thr Ser Gly Phe Thr Ser Leu Phe Ser Ser Gly Ala Ser Gln Lys
20 25 30

Ile Gln Leu
35

<210> 35
<211> 19 
<212> PRT 

<213> Hepatitis C virus 
<400> 35 

Gly Trp Gly Pro Ile Ala Tyr Thr Glu Pro Asn Ser Ser Asp Gln Arg
1 5 10 15

Pro Tyr Cys 



<210> 36 
<211> 35 
<212> PRT 

<213> Hepatitis C virus 
<400> 36 

Ala Gly Val Asp Gly Thr Thr Tyr Thr Ser Gly Gly Val Ala Gly Arg 
1 5 10 15

Thr Thr Ser Gly Phe Thr Ser Leu Phe Ser Pro Gly Ala Ser Gln Lys
20 25 30

Ile Gln Leu
35 



<210> 37 
<211> 35 
<212> PRT 

<213> Hepatitis C virus 
<400> 37 

Thr Gly Val Asp Gly Thr Thr Tyr Thr Ser Gly Gly Ala Ala Gly Arg 
1 5 10 15

Thr Thr Ser Gly Phe Thr Ser Leu Phe Ser Ser Gly Ala Ser Gln Lys
20 25 30

Ile Gln Leu
35 



<210> 38 
<211> 35 
<212> PRT 
<213> Hepatitis C virus



<400> 38 

Thr Gly Val Asp Gly Thr Thr Tyr Thr Ser Gly Gly Val Ala Gly Arg
1 5 10 15

Thr Thr Ser Gly Phe Thr Ser Leu Phe Ser Ser Gly Ala Ser Gln Lys
20 25 30

Ile Gln Leu
35



<210> 39 
<211> 19 
<212> PRT 

<213> Hepatitis C virus 
<400> 39 

Gly Trp Gly Pro Ile Thr His Thr Glu Pro Asn Ser Ser Asp Gln Arg



Pro Tyr Cys 



<210> 40 
<211> 19 
<212> PRT 

<213> Hepatitis C virus 
<400> 40 

Gly Trp Gly Pro Ile Thr Tyr Thr Gly Pro Asp Ser Leu Asp Gln Arg



Pro Tyr Cys 



<210> 41 
<211> 35 
<212> PRT 

<213> Hepatitis C virus 



<400> 41 

Ala Gly Val Asp Gly Ala Thr Tyr Thr Ser Gly Gly Val Ala Gly Arg
1 5 10 15

Thr Thr Ser Gly Phe Thr Ser Leu Phe Ser Ser Gly Ala Ser Gln Lys
20 25 30

Ile Gln Leu
35 



<210> 42 
<211> 19 
<212> PRT 

<213> Hepatitis C virus 
<400> 42 

Gly Trp Gly Pro Ile Thr Tyr Thr Glu Pro Asn Ser Pro Asp Gln Arg



Pro Tyr Cys 



<210> 43 
<211> 35 
<212> PRT 

<213> Hepatitis C virus 
<400> 43 

Ala Gly Val Asp Gly Lys Thr Tyr Thr Ser Gly Gly Ala Ala Ser His
1 5 10 15

Thr Thr Ser Arg Phe Thr Ser Leu Phe Ser Pro Gly Ala Ser Gln Arg
20 25 30

Ile Gln Leu
35



<210> 44 
<211> 19 
<212> PRT 

<213> Hepatitis C virus 
<400> 44 

Gly Trp Gly Pro Ile Thr Tyr Thr Glu Ser Gly Ser Arg Asp Gln Arg



Pro Tyr Cys

<210> 45
<211> 35 
<212> PRT 

<213> Hepatitis C virus 
<400> 45 

Ala Gly Val Asp Gly Glu Thr Tyr Thr Ser Gly Gly Ala Ala Ser His
1 5 10 15

Thr Thr Ser Thr Leu Ala Ser Leu Phe Ser Pro Gly Ala Ser Gln Arg
20 25 30

Ile Gln Leu
35



<210> 46 
<211> 19 
<212> PRT
<213> Hepatitis C virus 

<400> 46 

Gly Trp Gly Pro Ile Thr Tyr Thr Glu Pro Asp Ser Pro Asp Gln Arg



Pro Tyr Cys 



<210> 47 
<211> 341 
<212> DNA 

<213> Hepatitis C virus 
<400> 47 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac c 341



<210> 48 
<211> 341 
<212> DNA 

<213> Hepatitis C virus 
<400> 48

gccagccccc tgatgggggc gacactccac catgaatcac tcccctgtga ggaactactg 60 
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac c 341 



<210> 49 
<211> 341 
<212> DNA 

<213> Hepatitis C virus 
<400> 49 

gccagccccc tgatgggggc gacactccac catgaatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 
gacgaccggg tcctttcttg gataaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 
gcaagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac c 341 



<210> 50 
<211> 41 
<212> DNA 

<213> Hepatitis C virus 
<400> 50 

tgaacgggga gctaaccact ccaggccaat aggccttcct g 41 

<210> 51 
<211> 42 
<212> DNA 

<213> Hepatitis C virus 
<400> 51 

tgaacgggga gctaaccact ccaggcctta agccatttcc tg 42 

<210> 52 
<211> 43 
<212> DNA 

<213> Hepatitis C virus 
<400> 52

tgaaggttgg ggtaaacact ccggcctctt aagccatttc ctg 43



<210> 53 
<211> 16 
<212> DNA 

<213> Hepatitis C virus 
<400> 53 

ggtggctcca tcttag 



<210> 54 
<211> 19 
<212> DNA 

<213> Hepatitis C virus 
<400> 54 

aatggtggct ccatcttag 



<210> 55 
<211> 82 
<212> DNA 

<213> Hepatitis C virus 



<400> 55 

ccctagtcac ggctagctgt gaaaggtccg tgagccgcat gactgcagag agtgctgata 60 
ctggcctctc tgcagatcat gt 



<210> 56 
<211> 40 
<212> DNA 

<213> Hepatitis C virus 
<400> 56 

aggttggggt aaacactccg gcctcttaag ccatttcctg 



<210> 57 
<211> 81 
<212> DNA 

<213> Hepatitis C virus 
<400> 57 

tttttttttt tttttttttt ttttttttct tttttttttt ctttcctttc cttctttttt 60 
tcctttcttt ttcccttctt t

<210> 58
<211> 101 
<212> DNA 

<213> Hepatitis C virus 
<400> 58 

aatggtggct ccatcttagc cctagtcacg gctagctgtg aaaggtccgt gagccgcatg 60 
actgcagaga gtgctgatac tggcctctct gcagatcatg t 101 



<210> 59 
<211> 127 
<212> DNA 

<213> Hepatitis C virus 
<400> 59 

tgaaggttgg ggtaaacact ccggcctctt aagccatttc ctgttttttt tttttttttt 60
tttttttttt tctttttttt tttctttcct ttccttcttt ttttcctttc tttttccctt 120
ctttaat 127



<210> 60 
<211> 183 
<212> DNA 

<213> Hepatitis C virus 
<400> 60 

tgaaggttgg ggtaaacact ccggcctctt aagccatttc ctgttttttt tttttttttt 60
tttttttttt tctttttttt tttctttcct ttccttcttt ttttcctttc tttttccctt 120
ctttaatggt ggctccatct tagccctagt cacggctagc tgtgaaaggt ccgtgagccg 180
cat 183



<210> 61 
<211> 52 
<212> DNA 

<213> Hepatitis C virus 
<400> 61 

tgagccgcat gactgcagag agtgctgata ctggcctctc tgcagatcat gt 



<210> 62 
<211> 105 
<212> DNA 

<213> Hepatitis C virus 
<400> 62

tgaaattggt ggctccatct tagccctagt cacggctagc tgtgaaaggt ccgtgagccg 60 
catgactgca gagagtgctg atactggcct ctctgcagat catgt 105 



<210> 63 
<211> 176 
<212> DNA 
<213> Hepatitis C virus



<400> 63 

tgaaggttgg ggtaaacact ccggcctctt aagccatttc ctgttttttt tttttttttt 60
tttttttttt tctttttttt tttctttcct ttccttcttt ttttcctttc tttttccctt 120
ctttaatgcc gcatgactgc agagagtgct gatactggcc tctctgcaga tcatgt 176



<210> 64 
<211> 200 
<212> DNA 

<213> Hepatitis C virus 
<400> 64 

tgacttaagc catttcctgt tttttttttt tttttttttt tttttttctt tttttttttc 60
tttcctttcc ttcttttttt cctttctttt tcccttcttt aatggtggct ccatcttagc 120
cctagtcacg gctagctgtg aaaggtccgt gagccgcatg actgcagaga gtgctgatac 180
tggcctctct gcagatcatg 200



<210> 65 
<211> 144 
<212> DNA 

<213> Hepatitis C virus 



<400> 65 

tgaaggttgg ggtaaacact ccggcctctt aagccatttc ctgaatggtg gctccatctt 60 
agccctagtc acggctagct gtgaaaggtc cgtgagccgc atgactgcag agagtgctga 120
tactggcctc tctgcagatc atgt 144 